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ABSTRACT 


We report on Bayesian estimation of the radius, mass, and hot surface regions of the massive millisec- 
ond pulsar PSR J07404-6620, conditional on pulse-profile modeling of Neutron Star Interior Composi- 
tion Explorer X-ray Timing Instrument (NICER XTI) event data. We condition on informative pulsar 
mass, distance, and orbital inclination priors derived from the joint NANOGrav and CHIME/Pulsar 
wideband radio timing measurements of Fonseca et al. (2021a). We use XMM-Newton European 
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Photon Imaging Camera spectroscopic event data to inform our X-ray likelihood function. The prior 
support of the pulsar radius is truncated at 16 km to ensure coverage of current dense matter models. 
We assume conservative priors on instrument calibration uncertainty. We constrain the equatorial ra- 
dius and mass of PSR J07404-6620 to be 12.39* 52$ km and 2.072*0 066 Mo respectively, each reported 
as the posterior credible interval bounded by the 16% and 84% quantiles, conditional on surface hot 
regions that are non-overlapping spherical caps of fully-ionized hydrogen atmosphere with uniform 
effective temperature; a posteriori, the temperature is logio (T [K]) = 5.99+9:9% for each hot region. 
All software for the X-ray modeling framework is open-source and all data, model, and sample infor- 
mation is publicly available, including analysis notebooks and model modules in the Python language. 
Our marginal likelihood function of mass and equatorial radius is proportional to the marginal joint 
posterior density of those parameters (within the prior support) and can thus be computed from the 


posterior samples. 


Keywords: dense matter — equation of state — pulsars: general — pulsars: individual 
(PSR J0740+6620) — stars: neutron — X-rays: stars 


1. INTRODUCTION 


The nature of supranuclear density matter, as found 
in neutron star cores, is highly uncertain. Possibilities 
include both neutron-rich nucleonic matter and stable 
states of strange matter in the form of hyperons or 
deconfined quarks (for recent reviews see Oertel et al. 
2017; Baym et al. 2018; Tolos & Fabbietti 2020; Yang & 
Piekarewicz 2020; Hebeler 2021). One way to determine 
the dense matter Equation of State (the EOS, a func- 
tion of both composition and inter-particle interactions) 
is to measure neutron star masses and radii (Lattimer 
& Prakash 2016; Ozel & Freire 2016). There are several 
possible methods, but in this Letter we focus on pulse- 
profile modeling (see Watts et al. 2016; Watts 2019, and 
references therein). This requires precise phase-resolved 
spectroscopy, a technique that motivated the design and 
development of NASA’s Neutron Star Interior Compo- 
sition Explorer (NICER). 

The NICER X-ray Timing Instrument (XTI) is a pay- 
load installed on the International Space Station. The 
primary observations carried out by NICER are order 
megasecond exposures of rotation-powered X-ray mil- 
lisecond pulsars (MSPs) that may be either isolated or 
in a binary system (Bogdanov et al. 2019a). Surface X- 
ray emission from the heated magnetic poles propagates 
to the NICER XTI through the curved spacetime of the 
pulsar, and the compactness affects the signal registered 
by the instrument. However, these pulsars also spin at 
relativistic rates. So with a precisely measured spin 
frequency derived from radio timing and high-quality 
spectral-timing event data, we are also sensitive to rota- 
tional effects on surface X-ray emission, and therefore to 
the radius of the pulsar independent of the compactness 
(see Bogdanov et al. 2019b, and references therein). 


The first joint mass and radius inferences conditional! 
on pulse-profile modeling of NICER observations of a 
MSP were reported by Miller et al. (2019) and Riley 
et al. (2019). 

The target was PSR. J0030+0451, an isolated? source 
spinning at approximately 205 Hz. Being isolated, the 
radio timing model for this MSP has no dependence 
on its mass, in contrast to the radio timing model for 
an MSP in a binary. This meant that a wide prior on 
the mass had to be assumed in the pulse-profile mod- 
eling, which nevertheless - due to the high quality of 
the data set in terms of the number of pulsed counts - 
delivered credible intervals on the mass and radius pos- 
teriors at the ~ 10% level. These posterior distributions 
have been used to infer properties of the dense matter 
EOS (in combination with constraints from radio tim- 
ing, gravitational wave observations, and nuclear physics 
experiments). To give a few examples, there have been 
follow-on studies constraining both parameterized EOS 
models (Miller et al. 2019; Raaijmakers et al. 2019, 2020; 
Dietrich et al. 2020; Jiang et al. 2020; Al-Mamun et al. 
2021) and non-parameterized EOS models (Essick et al. 
2020; Landry et al. 2020), some focusing particularly on 
the neutron star maximum mass (Lim et al. 2020; Tews 
et al. 2021). Others have focused on specific nuclear 
physics questions: hybrid stars and phase transitions to 
quark matter (Tang et al. 2021; Li et al. 2020; Christian 
& Schaffner-Bielich 2020; Xie & Li 2021; Blaschke et al. 
2020; Alvarez-Castillo et al. 2020); the three nucleon po- 
tential (Maselli et al. 2021); relativistic mean-field mod- 
els (Traversi et al. 2020); muon fraction content (Zhang 


1 For an introduction to the concept of conditional probabilities 
within Bayesian inference see Sivia & Skilling (2006); Trotta 
(2008); Hogg (2012); Gelman et al. (2013); Clyde et al. (2021); 
Hogg et al. (2020). 

? No binary companion has ever been detected despite 20 years of 


intensive radio timing (Lommen et al. 2000; Arzoumanian et al. 
2018). 
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& Li 2020a); and the nuclear symmetry energy (Zim- 
merman et al. 2020; Biswas et al. 2021). This is by no 
means an exhaustive review of the citing literature, but 
serves to give a flavor of how the previous NICER result 
has been used. 

Pulse-profile modeling also yields posterior distribu- 
tions for the properties of the hot X-ray emitting re- 
gions on the star's surface, which are assumed to be 
related to the star's magnetic field structure (Pavlov & 
Zavlin 1997). The analysis of PSR J0030--0451 implied 
a complex non-dipolar field (Bilous et al. 2019) and the 
posteriors have been used in follow-on studies of pulsar 
magnetospheres and radiation mechanisms (Chen et al. 
2020; Suvorov & Melatos 2020; Kalapotharakos et al. 
2021). 

'The subject of this Letter is the rotation-powered mil- 
lisecond pulsar PSR J07404-6620, spinning at approx- 
imately 346 Hz as it orbits with a binary companion 
(Cromartie et al. 2020). Being in a binary at a favor- 
able inclination for measurement of the Shapiro delay 
allows the mass of this source to be measured indepen- 
dently via radio timing (e.g., Lorimer & Kramer 2004). 
Cromartie et al. (2020) reported a mass of 2.14* 0:19 Mo, 
making this the highest (well-constrained) mass neutron 
star. High-mass neutron stars (with the highest central 
densities) are particularly powerful in terms of their po- 
tential to constrain the dense matter EOS. The mass 
alone can cut down parameter space, but a measure- 
ment of radius adds far more (see for example Han & 
Prakash 2020; Xie & Li 2020). 

The North American Nanohertz Observatory for 
Gravitational Waves (NANOGrav) and Canadian Hy- 
drogen Intensity Mapping Experiment (CHIME) Pulsar 
collaborations recently joined forces to perform wide- 
band radio timing of PSR J07404-6620 (Fonseca et al. 
2021a). They derived an updated measurement of the 
pulsar mass (2.08 + 0.07 Ma), its distance from Earth, 
and the orbital inclination. From these informative mea- 
surements and NICER observations (Wolff et al. 2021) 
comes the potential for synergistic constraints on X-ray 
pulse-profile parameters that do not appear in the wide- 
band radio timing solution; in particular, the radius of 
PSR. J07404-6620 and the properties of the hot surface 
X-ray emitting regions conditional on a model. We re- 
port such inferences in this Letter. 

We organize this Letter as follows. In Section 2 we 
summarize the pulse-profile model components; we pro- 
vide additional detail about novel model components to 
augment the information in Riley et al. (2019) and Bog- 
danov et al. (2019b). Section 2 also covers the X-ray like- 
lihood function—which is the probability of the NICER 
XTI event data and the spectroscopic event data ac- 
quired by the XMM-Newton European Photon Imaging 
Camera (EPIC)—and details about the prior probability 
density functions of model parameters that are funda- 
mentally shared by all models or multiple models. In 
Section 3 we report model inferences and details about 


the prior probability density functions that are specific 
to a given surface hot region model. In Section 4 we 
discuss these inferences in detail, covering their physical 
implications and potential systematic errors. In Sec- 
tion 5 we conclude by reporting the mass and radius 
constraints and commenting on the outlook for future 
pulse-profile modeling efforts. EOS inference using our 
derived mass-radius posterior is carried out in a com- 
panion paper (Raaijmakers et al. 2021). 


2. MODELING PROCEDURE 


'The methodology in this Letter is largely shared with 
that of Riley et al. (2019), Bogdanov et al. (2019b), and 
Bogdanov et al. (2021). In this section, we summarize 
that methodology and give a more detailed report of the 
new model components. 

We formulate the general form of the likelihood func- 
tion shared by all models and also the prior probability 
density functions (PDFs) of parameters that are shared 
by all models. We reserve definitions of prior PDFs 
of phenomenological surface hot region parameters for 
Section 3 where posterior inferences are reported. All 
posterior PDFs are computed using the X-ray Pulse 
Simulation and Inference (X-PSI) v0.7 framework (Ri- 
ley 2021), an updated version of the package used by 
Riley et al. (2019).? The analysis files may be found 
in the persistent repository of Riley et al. (2021): the 
data products; the numeric model files including the 
telescope calibration products; model modules in the 
Python language using the X-PSI framework; posterior 
sample files; and Jupyter analysis notebooks. We begin 
by introducing the data sets used in the analysis, includ- 
ing the most important aspects of the data selection and 
preparation. 


2.1. X-ray event data 


In this section we summarize the event data sets 
reported by Wolff et al. (2021), including any pre- 
processing tailored to this present Letter. 


2.1.1. NICER XTI 


'The PSR J07404-6620 analysis is based on a sequence 
of exposures with NICER XTI (hereafter NICER) ac- 
quired in the period 2018 September 21 — 2020 April 17. 
The event data was obtained using similar filtering cri- 
teria to the previously analyzed NICER data set of PSR 
J0030--0451. We only used good time intervals when all 
52 active detectors were collecting data but rejected all 
events from DetID 34, as it often shows enhanced count 
rates relative to the other 51 detectors. We excluded 
time intervals when PSR J07404-6620 was situated at 
an angle € 80? from the Sun to reduce the increase in 
background in the lowest channels due to optical load- 
ing. We further excised events collected at low cut-off 


3 https://github.com/ThomasEdwardhRiley /xpsi 
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rigidity (COR.SAX values « 5) to minimize particle 
background contamination. The resulting cleaned event 
list has an on-source exposure time of 1602683.761 s. 

We model events registered in the PI channel subset 
[30, 150), corresponding to the nominal photon energy 
range [0.3, 1.5] keV.4 The quoted nominal photon energy 
for a channel is the energy mid-point for that channel. 

Below nominal energy 0.3 keV, the instrument cali- 
bration products have greater a priori uncertainty due 
to the sharp lower-energy threshold cutoff, as well as 
the presence of unrejected instrumental noise that can 
extend above 0.25 keV. We therefore neglect the infor- 
mation below channel 30 in order to reduce the risk of in- 
ferential bias. Wolff et al. (2021) detect pulsations with 
the highest significance when considering event data up 
to nominal energy ~ 1.2 keV; in this Letter we include 
the additional information at nominal energies in the 
range [1.2, 1.5] keV. The number of counts generated 
by the PSR J0740--6620 hot regions in channels 150 
and above, however, is small and diminishes relative to 
the counts generated by background processes (includ- 
ing a non-thermal component from the environment in 
the near-vicinity of PSR J07404-6620); we therefore ne- 
glect this higher-energy information, and focus energy 
resolution at nominal energies below 1.5 keV. 

'The NICER event data are phase-folded according to 
the NANOGrav radio timing solution presented in Fon- 
seca et al. (2021a). The phase-binned count numbers 
are displayed in Figure 1. 


2.1.2. XMM-Newton EPIC 


The XMM-Newton (hereafter XMM) telescope ob- 
served PSR J0740--6620 as part of a Director's Dis- 
cretionary Time program in three visits: 2019 Octo- 
ber 26 (ObsID 0851181601), 2019 October 28 (Ob- 
sID 0851181401), and 2019 November 1 (ObsID 
0851181501). The European Photon Imaging Camera 
(EPIC) instruments (pn, MOS1, and MOS2) were em- 
ployed in ‘Full Frame’ imaging mode with the ‘Thin’ op- 
tical blocking filters in place. Due to the insufficiently 
fast read-out times (73.4 ms for EPIC-pn and 2.6 s for 
EPIC-MOS1/2), the data do not provide useful pulse 
timing information, so only phase-averaged spectral in- 
formation is available. The XMM data were reduced 
using the Science Analysis Software (SAS?) using the 
standard set of analysis threads. 

'The event data were first screened for periods of strong 
soft proton background flaring. The resulting event 
lists were then further cleaned by applying the recom- 


^ A photon that deposits all of its energy, E € [0.3,1.5) keV, in 
a detector with perfect energy resolution is registered in channel 
subset [30,150) after mapping according to the gain-scale cali- 
bration product. 

5 The XMM-Newton SAS is developed and maintained by the Sci- 
ence Operations Centre at the European Space Astronomy Cen- 
tre and the Survey Science Centre at the University of Leicester. 


mended PATTERN (x 12 for MOS1/2 and < 4 for 
pn) and FLAG (0) filters. The final source event lists 
were obtained by extracting events from a circular re- 
gion of radius 25" centered on the radio timing position 
of PSR J07404-6620. This resulted in effective exposure 
times of 6.81, 17.96, and 18.7 ks for the pn, MOSI, and 
MOS2 instruments, respectively. 


2.2. Radiation propagation from surface to telescope 
2.2.1. Design of the equatorial radius prior 


One of the ultimate aims of this Letter is to report a 
likelihood function of mass and (equatorial) radius that 
can be used for EOS posterior computation. As high- 
lighted by Riley et al. (2018), it is desirable to define 
a prior PDF which is jointly flat with respect to two 
parameters within the prior support; these parameters 
can simply be mass and radius, or some deterministi- 
cally related variables, such as mass and compactness. 
'The marginal joint posterior PDF of these parameters is 
then proportional to the marginal likelihood function of 
these parameters, meaning that the marginal likelihood 
function can be estimated from posterior samples, for 
use in subsequent inferential analyses. 

In this Letter we follow Riley et al. (2019) by defin- 
ing a joint prior PDF of mass and radius that is flat 
within the prior support, which is maximally inclusive 
in regards to theoretical EOS predictions. The prior 
support is zero for R > 16 km because we are not aware 
of any EOS models predicting a radius higher than this 
limit that would be compatible with current constraints 
from nuclear physics, or with the constraints posed by 
the gravitational wave measurement of tidal deformabil- 
ity for the binary neutron star merger GW170817 (see, 
e.g., Reed et al. 2021). A difference to Riley et al. (2019) 
is that we define the prior support using a higher com- 
pactness limit as discussed in Section 2.2.3 (see Table 1 
for the prior PDF and support). The prior support is 
also subject to the condition that the effective gravity 
lies within a bounded range at every point of the rotat- 
ing oblate surface, in order to conform to bounds on the 
atmosphere models we condition on (see Section 2.3.2 
and Table 1). 

In Section 2.2.2 we introduce information about the 
mass in the form of a marginal PDF whose shape ap- 
proximates the marginal likelihood function of mass de- 
rived in a recent radio timing analysis; we multiply this 
likelihood function with the jointly flat prior PDF of 
mass and radius described above, thereby defining an 
updated prior PDF for the X-ray pulse-profile modeling 
in this Letter. Our prior PDF for pulse-profile modeling 
is therefore not jointly flat with respect to mass and ra- 
dius, but all shape information is likelihood-based, sat- 
isfying the requirements of Riley et al. (2018). 


2.2.2. Mass, inclination, and distance priors 


Radiation is transported from the oblate X-ray emit- 
ting surface to distant static telescopes by relativistic 
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Figure 1. Phase-folded PSR J07404-6620 event data for two rotational cycles (for clarity): we use 32 phase intervals (bins) 
per cycle and the count numbers in bins separated by one cycle—in a given channel—are identical. The total number of 


counts is given by the sum over all phase-channel pairs. The top panel displays the pulse-profile summed over the contiguous 


subset of channels [30,150). We use the red bar to indicate the typical standard deviation that an adequately performing 


Poisson count model will exhibit. The bottom panel displays the phase-channel resolved count numbers for channel subset 


[30, 150). For likelihood function evaluation we group all event data registered in a given channel into phase intervals spanning 


a single rotational cycle. The description in this caption is based on that given by Riley et al. (2019) about the corresponding 


count-number figure for PSR J0030--0451. 


ray-tracing, as described by Bogdanov et al. (2019b) 
and references therein. The gravitational mass of 
PSR J07404-6620, the distance of PSR J07404-6620 from 
Earth, and the inclination of the Earth's line-of-sight to 
the PSR J07404-6620 spin axis are all parameters of the 
source-receiver system that need to be specified in or- 
der to simulate an X-ray signal incident on a telescope, 
and can be inferred via pulsar radio timing. Note that 
these static model X-ray telescopes are fictitious con- 
structs: the real telescopes are in motion relative to the 
pulsar (see, e.g., Riley 2019). The NICER event data 
pre-processing operations include the phase-folding of 
on-board arrival times according to a NANOGrav radio 
timing solution; the phase-folded events are then the 
events that would be registered by a telescope that is 
static relative to the pulsar. 

The NANOGrav and CHIME/Pulsar collabora- 
tions jointly performed wideband radio timing of 
PSR J0740+6620 (Fonseca et al. 2021a,b). A product 


of this radio timing work was a joint posterior PDF of 
the pulsar gravitational mass, Earth distance, and the 
inclination Earth subtends to the orbital direction? that 
we can condition on as an informative prior PDF for 
joint NICER and XMM X-ray pulse-profile modeling. 
The synergy—due to degeneracy breaking—between ra- 
dio timing and X-ray modeling yields a higher sensitiv- 
ity to the pulsar radius and the parameters of a sur- 
face X-ray emission model. We used two sets of joint 
posterior PDFs provided by Fonseca et al. (2021a) as 
prior information over the course of our analysis: one set 
that was numerically estimated using a weighted least- 
squares solver, used in this Letter for an exploratory 
analysis (Section 3.2); and a second that instead used 


6 The pulsar spin angular momentum is assumed to be parallel to 
the orbital angular momentum, but we also test sensitivity to an 


isotropic spin-direction prior. 
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a generalized least-squares (GLS) algorithm for deter- 
mining timing model parameters from wideband timing 
data, used in this Letter for a production analysis (Sec- 
tion 3.3). Fonseca et al. (2021a) report on results ob- 
tained with GLS fitting of wideband timing data, though 
they publicly provide both sets of PDFs as they were 
used in this work. 

Systematic error in the parameter estimates by 
NANOGrav and CHIME/Pulsar is dominated by sensi- 
tivity to the choice of the dispersion-measure variability 
(DMX) model. Posterior PDFs were computed for three 
DMX variants, and the systematic error implied by the 
posterior variation was substantially smaller than the 
formal posterior spread conditional on any one DMX 
model. We average (marginalize) the posterior PDFs 
over the three DMX models with a uniform weighting." 

'The prior PDFs we implement for every model that 
conditions on joint NANOGrav and CHIME/Pulsar 
wideband radio timing are as follows. The distance is 
separable from the pulsar mass and the Earth inclination 
to the orbital axis, but the mass and (cosine of the) in- 
clination are correlated. For the distance, we implement 
the NANOGrav x CHIME/Pulsar measurement by first 
marginalizing the PDFs over the DMX models and then 
reweighting from the flat distance prior conditioned on 
by NANOGrav and CHIME/Pulsar to a physical dis- 
tance prior in the direction of PSR J07404-6620 follow- 
ing the exemplar treatment of distance information by 
Igoshev et al. (2016).5 The physical distance prior, how- 
ever, remains relatively flat in the context of the likeli- 
hood function—meaning the likelihood function is dom- 
inant in the measurement—and the modification to the 
original DMX-averaged PDF is entirely minor. Addi- 
tional distance likelihood information derives from the 
Shklovskii effect which effectively truncates the PDF, 
putting an upper limit on the distance (Shklovskii 1970); 
such truncation however occurs well into the tail of the 
NANOGrav x CHIME/Pulsar distance distribution, so 
it is also an unimportant detail. Finally, we approximate 
the NANOGrav x CHIME/Pulsar marginal PDF of dis- 
tance as a skewed and truncated Gaussian distribution 
(please see Table 1 for details). We display the marginal 
distance PDF variants referenced above in Figure 2. 

For the mass and the (cosine of the) orbital inclina- 
tion we implement a multi-variate Gaussian distribution 
with the covariance matrix of the DMX-averaged PDF. 
Implicit in this PDF is a prior PDF defined by the ra- 
dio timing collaboration. The marginal prior PDF of 
the pulsar mass is not flat: the prior PDF of the cosine 


of the orbital inclination and of the mass of the white 
dwarf companion of PSR J0740+6620 is jointly flat, and 
these variables map deterministically to the pulsar mass 
through the binary mass function and (precise) radio 
timing of the system. We do not modify the joint prior 
PDF of the pulsar mass and the inclination despite the 
likelihood function of the pulsar mass (marginalized over 
all other radio timing parameters) being formally desir- 
able for EOS parameter estimation (see Section 2.2.1 
and Riley et al. 2018). In this case the posteriors are 
sufficiently dominated by the likelihood function to ig- 
nore the small structural modifications that would re- 
sult from tweaking relatively diffuse priors. Moreover, 
changing the pulsar mass prior PDF would change the 
prior PDF of the companion, which may be undesirable. 

Ultimately, handling the detailed structure? of these 
prior PDFs—i.e., beyond the location and marginal 
spread of the parameters—is not important because the 
NICER likelihood function is not sufficiently sensitive 
to some combination of these radio-timing parameters 
and its native parameters (such as equatorial radius) 
for posterior inferences to change to any discernable de- 
gree. That is, the changes will be difficult to resolve 
from sampling noise and implementation error for in- 
stance (Higson et al. 2018, 2019), and relative to model- 
to-model systematic variations that, when marginalized 
over, further broaden the posteriors of shared param- 
eters of interest. Also note that the distance only en- 
ters in the likelihood function in combination with the 
effective-area scaling parameter defined in Section 2.4 
that operates on all X-ray instrument response mod- 
els because of global calibration error; the distance to 
PSR J0740+6620 could therefore be combined with this 
absolute scaling parameter, further diminishing the im- 
portance of treating fine details in the prior PDF of the 
distance. 


2.2.3. Relativistic ray-tracing 


The original version of the X-PSI package used to 
model PSR J0030+0451 (Riley et al. 2019) via relativis- 
tic ray-tracing assumed that no more than one ray con- 
nected the telescope to each point on the stellar surface. 
However, photons emitted from a star with compactness 
greater than GM/Rc? = 0.284 can have a deflection an- 
gle larger than 7, which makes multiple images of small 
regions at the “back” of the star possible. This was not 
an issue for the analysis of PSR J00304-0451, because 
the 9596 credible region only allowed compactness val- 
ues in the range of GM/Rc? < 0.171. Additionally, the 


7 Equivalent to choosing a prior mass function of the DMX models 
that yields posterior probability ratios of unity between those 
models. 

8 The ecliptic coordinates of PSR J07404-6620 reported by Arzou- 
manian et al. (2018) were transformed to galactic coordinates 
using Astropy (http://www.astropy.org; Astropy Collaboration 
et al. 2013, 2018). 


9 Please see Figure 2 for reference. The form of the DMX- 
marginalized joint mass and inclination prior PDF is the domi- 
nating factor in the posterior PDF rendered in Figure 7; for sup- 
plementary plots of the variation of the joint prior PDF of mass 
and inclination with DMX model, please refer to the analysis 
notebooks released with this Letter, in which the model compo- 
nents are constructed. 
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Figure 2. Implementation of the joint NANOGrav and 
CHIME/Pulsar posterior PDF of distance (Fonseca et al. 
2021a) as a prior PDF in the context of joint NICER 
and XMM X-ray pulse-profile modeling. The red distri- 
butions are marginal posterior PDFs of the distance de- 
rived by NANOGrav and CHIME/Pulsar, conditional on 
a flat prior PDF; the solid red distribution is the poste- 
rior PDF marginalized over the three DMX models, and 
each of the lightweight red distributions is a posterior PDF 
conditional on a particular DMX model. These posterior 
PDFs of distance are proportional to the marginal like- 
lihood function of distance. The black dash-dot PDF is 
the prior distribution of galactic pulsars in the direction of 
PSR. J0740+6620, adopted from Igoshev et al. (2016) and 
renormalized to the interval D € [0.6, 2.0] kpc on which the 
NANOGrav x CHIME/Pulsar PDF is supported. The black 
solid distribution is the posterior PDF of distance condi- 
tional on the black dash-dot prior PDF. The blue dashed 
distribution is an approximating PDF that we condition 
on as a prior PDF for the pulse-profile modeling in this 
Letter, after renormalizing to be supported on the interval 
D € (0.0, 1.7] kpc; please refer to Table 1 for details needed 
to reproduce this approximating PDF. 


independent analysis by Miller et al. (2019) allowed for 
the possibility of multiply-imaged regions on the star 
but still led to a credible range on compactness that is 
too small to allow for multiple images. 

The radio observations of Shapiro delay in the 
PSR J07404-6620 binary system by NANOGrav and 
CHIME/Pulsar (Fonseca et al. 2021a) lead to the 
marginal 68% credible interval of M = 2.08 + 0.07 Mo. 
For reference, a small selection of EOS that cover a real- 
istic range of stiffness are shown in Figure 3. The EOS 
shown include a set of three EOS (HLPS soft, intermedi- 


ate, and stiff) constructed by Hebeler et al. (2013) that 
span a range of stiffness allowed by nuclear experiments, 
as well as the A184-ó(v)--UIX* EOS (Akmal et al. 1998) 
(abbreviated to APR). Each of the four curves shows the 
values of the equatorial compactness and mass for stars 
rotating at the rate of 346 Hz, computed using the code 
rns (Stergioulas & Friedman 1995). Figure 3 shows that 
for this set of EOS, the large mass values (> 2.0M) lead 
to a significant range of models that have large enough 
compactness (i.e., are above the horizontal line at 0.284) 
that multiple images of some part of the star are pos- 
sible. Because the stars are oblate, the compactness at 
the spin poles is larger than the equatorial compactness, 
so the range of multiply-imaged stars extends to slightly 
lower values of compactness, however the change is not 
perceptible on this figure. 

Due to the expectation that PSR J07404-6620 could 
be very compact, the X-PSI code was extended to al- 
low multiple rays from any point to reach the telescope. 
'This improvement to X-PSI as well as details of code 
validation are discussed in Bogdanov et al. (2021). X- 
PSI sums over the primary and the visible higher-order 
images of any regions on a star that lie in a multiply- 
imaged region. The X-PSI package can typically detect 
up to the quaternary, quinary, or senary order depend- 
ing on resolution settings. In practice only secondary 
images, and potentially the tertiary images for some 
configurations, might be important; but omission of sec- 
ondary images can lead to large errors of O(10%) in the 
light-curve calculation (see, e.g., the X-PSI documenta- 
tion!?). 


2.2.4. Interstellar attenuation of X-rays 


'The X-ray signal is attenuated to some degree by the 
intervening interstellar medium along the line of sight 
to the pulsar. In all models, the attenuation physics 
is parameterized solely by the neutral hydrogen column 
density Ng, relative to which the abundances of all other 
attenuating gaseous elements, dust, and grains are fixed 
by the state-of-the-art TBabs model (Wilms et al. 2000, 
updated in 2016). We implement this attenuation model 
as a one-dimensional lookup table with respect to energy 
at a fiducial column density, and then raise the atten- 
uation factor to the power of the ratio of the column 
density to the fiducial value. 

The column density along the line of sight to 
PSR J07404-6620 can be estimated using several tech- 
niques. The HEASARC neutral hydrogen map tool!! 


10 https://thomasedwardriley.github.io/xpsi/multiple. imaging. 
html 
11 https:/ /heasarc.gsfc.nasa.gov/cgi- bin/Tools/w3nh/w3nh.pl 
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Figure 3. Equatorial compactness (GM/ Rc?) versus mass 
for stars spinning at a rate of 346 Hz. The relations for four 
example EOS models (see text for description) are shown. 
The 9596 interval for the mass prior is shaded, as is the re- 
gion above compactness 0.284 where multiple images of the 
equator can appear. Note that for a range of compactness be- 
low 0.284, non-equatorial surface regions are multiply imaged 
due to oblateness. For the high masses of interest, multiple 
imaging is clearly relevant. 


yields Ng z 3.5 x 102° cm~? given the most modern 
map (HI4PI Collaboration et al. 2016).'?. 

Using the relation between dispersion measure (DM) 
and Ny from He et al. (2013) together with the DM 
measurement reported by Cromartie et al. (2020), Ng z 
4.5 x 107 cm-?. Finally, estimates based on 3D 
E (B — V) extinction maps (Lallement et al. 2018), to- 
gether with the relation between extinction and Ng 
from Foight et al. (2016), yield values of Ng z 4.5 x 
102° cm-?. 

Given that Ny could be as large as 4.5 x 107? cm7? 
based on these estimates, and given the large uncer- 
tainties in the relations between Ng and other quan- 
tities such as the DM, a conservative prior of Ng ~ 
U (0, 10?!) cm~? is warranted. 


2.3. Surface hot regions 
2.9.1. Temperature field 


'The surface effective temperature field of a rotation- 
powered MSP is one of—if not the—primary source of 
uncertainty a priori regarding the physical processes 
that generate the X-ray event data. The image of a 


neutron star cannot be resolved with any current X-ray 
telescope, so all our knowledge about the surface tem- 
perature field comes from models. These models heavily 
rely on the assumptions about magnetic field configura- 
tion, which is a crucial part in calculating heating by 
space-like or return current in some sub-region of the 
open field line footprints (Kalapotharakos et al. 2021) or 
anisotropic thermal conductivity of the outer star layers 
(e.g., Kondratyev et al. 2020; De Grandis et al. 2020). 
There is, therefore, an essentially unknown degree of 
complexity in the structure of the temperature field. 

A given likelihood function is insensitive to complex- 
ity beyond some degree. We focus on a simple model 
of the surface temperature field: two disjoint hot re- 
gions that are simply-connected spherical caps (when 
projected onto the unit sphere) wherein the effective 
temperature of the atmosphere is uniform. This model 
may be identified as ST-U in Riley et al. (2019). The 
phase-folded NICER pulse-profile is suggestive of two 
phase-separated hot regions which may or may not be 
disjoint. 

We could begin with an even simpler model that re- 
stricts the hot regions to be related via antipodal re- 
flection symmetry (ST-S in Riley et al. 2019). Comput- 
ing a posterior conditional on such a simple surface hot 
region model is justifiable, and it also delivers a lower- 
dimensional target distribution to check for egregious 
model implementation error (as reasoned by Riley et al. 
2019). However, for the analysis reported in this Let- 
ter, we immediately explored the ST-U model because 
antipodal reflection symmetry is physically unrealistic 
and the ST-U model is sufficiently simple and inexpen- 
sive to condition on; nevertheless, if one is interested in 
the question of whether we are sensitive to deviations 
from this symmetry, we open-source the entire analysis 
package for this Letter, and the online X-PSI documen- 
tation offers guidance on how to condition on the ST-S 
model. 

Nodes can be added to the model space by incre- 
menting the complexity of the surface hot regions, fol- 
lowing Riley et al. (2019), wherein symmetries between 
hot regions are broken!’ and temperature components 
are added.!^ In this Letter we introduce a new binary 
flag for the atmosphere composition (hydrogen or he- 
lium as discussed in Section 2.3.2) within the (single- 
temperature) hot regions. However, we consider only 
one higher-complexity model than ST-U. We opt to 
do this because subject to limited (computational) re- 


13'That is, the breaking of antipodal reflection symmetry and of 


shape symmetry. 


14 All models in Riley et al. (2019), two of which we condition on in 


1? Neutral hydrogen maps provide an integrated estimate along the 
line of sight through the Milky Way, and can therefore be inter- 
preted as an approximate upper limit. 


this Letter, can be labeled by the number of disjoint hot regions 
they define. Note that in practice the prior PDFs of tempera- 
ture and solid angle subtended at the stellar center are diffuse 
and permit the contribution of a hot region to be negligible a 
posteriori in the NICER and XMM wavebands. 
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sources, ST-U performs well and is ultimately deter- 
mined to be sufficiently complex. 

We now define prior PDFs that are shared by all mod- 
els. Every hot region has one effective temperature com- 
ponent. The prior PDF of that effective temperature is 
flat in its logarithm, diffuse,!° and is separable from the 
joint prior PDF of all other model parameters. Every 
hot region also has a colatitude coordinate and an az- 
imuth coordinate (i.e., phase shift) at the center of a 
constituent spherical cap with a finite effective temper- 
ature component (Riley et al. 2019). The prior PDF of 
these coordinates is non-trivial because it is not separa- 
ble from the prior PDF of other parameters when there 
are two surface hot regions in the model, as we proceed 
to explain. 

In order to eliminate a trivial form of hot region 
exchange-degeneracy,'Ó we define the prior support by 
imposing that one hot region object in the X-PSI model 
has a center colatitude that is always less than or equal 
to the center colatitude of the companion hot region. We 
also impose that the hot regions are disjoint, meaning 
that they cannot overlap in the prior support, because 
otherwise the model cannot always be characterized by 
the number of disjoint hot regions; the non-overlapping 
condition is a function of the center colatitudes, the cen- 
ter azimuths, and the hot region angular radii, meaning 
that the joint prior is not separable, and the marginal 
prior PDFs are modulated relative to the PDFs that 
were used to construct the form of the joint PDF in 
six dimensions. For instance, the joint prior PDF of the 
azimuthal coordinates exhibits rarefaction where the az- 
imuths are close in value. 

We now construct the prior PDF specifically for the 
ST-U model (Section 3.3) that is the focus of this Let- 
ter. First, as a construction tool, define a flat PDF of 
the cosine of each hot region center colatitude, with sup- 
port being the interval cos(0) € [-1, 1]; such a PDF is 
founded on the argument of isotropy, a priori, of the 
direction to Earth (see below). Second, define the PDF 
of each hot region center azimuth!" to be flat on the 
interval 27¢@ € [0, 21] radians. Third, define a flat PDF 
of each angular radius on the interval [0, 7/2]. To calcu- 
late the marginal prior PDF of one of these parameters, 
marginalize over the other five parameters subject to the 
prior support condition of non-overlapping hot regions. 
For instance, the prior PDF of a hot region angular ra- 
dius is given by marginalizing over the angular radius 
of the companion hot region, and the hot region cen- 
ter colatitudes and azimuths. Marginalization yields a 


15 For example, with prior support bounds of 105! K and 106-8 K 
for the NSX fully-ionized hydrogen atmosphere—please see Sec- 


tion 2.3.2. 


marginal prior PDF that deviates in form from the ini- 
tially flat PDF of the parameter—the support of the 
PDF remains unchanged however. 

Our hot region models are phenomenological and to 
a degree, agnostic, despite being influenced by temper- 
ature fields implied by pulsar magnetospheric simula- 
tions. Our choice here is contrary to Riley et al. (2019), 
wherein the prior PDF of the colatitude at the center of 
a hot region in isolation—meaning one hot region on the 
surface—was flat. A flat PDF of colatitude, combined 
with a flat prior in azimuth yields an anisotropic proba- 
bility density field on the unit sphere, weighted towards 
polar regions. One reason this might be a sub-optimal 
assumption is because of X-ray selection effects: pulsed 
emission from two hot regions will exhibit a larger am- 
plitude if the hot regions are closer to the equatorial 
zone than a polar zone, for equatorial observers as is 
the case assumed for PSR J0740+6620 based on the 
NANOGrav x CHIME/Pulsar wideband radio timing 
measurements (Section 2.2.2). However, such arguments 
are tenuous, taking no heed of radio selection effects and 
magnetospheric physics, for instance. 


2.3.2. Atmosphere 


The specific intensity is determined from lookup ta- 
bles generated using the NSX atmosphere code assuming 
either fully ionized hydrogen or helium (Ho & Lai 2001) 
or partially ionized hydrogen (Ho & Heinke 2009).!? In 
this work, we consider only fully ionized models since 
the limited parameter ranges of existing opacity tables 
(Iglesias & Rogers 1996; Badnell et al. 2005; Colgan et al. 
2016) reduce the accuracy of atmosphere models em- 
ploying these tables (see Bogdanov et al. 2021 and Miller 
et al. 2021 for further discussion and comparisons). 


2.3.3. Exterior of the hot regions 


The surface exterior to the hot regions does not ex- 
plicitly radiate in any model we condition on. The sig- 
nal that would be generated by a cooler exterior sur- 
face can be robustly subsumed in the phase-invariant 
count rate terms described in Section 2.5.1, provided 
that the angular scale of the hot regions (whose images 
are explicitly integrated over) is small and that exterior 
emission is soft and thus dominated by other NICER 
backgrounds in the channels with low nominal photon 
energies. Explicitly including radiation from the exte- 
rior surface would add a very weakly informative mode 
of dependence on the mass, radius, and other parameters 
of interest, and thus the likelihood function is assumed 
insensitive to its exact treatment. Explicit treatment 
would also require handling of the ionization state of 
the cooler atmosphere. 


16 Where precisely the same hot region configuration exists twice 
in parameter space, only one of which the prior support should 
include. 

17 A periodic parameter, also referred to as cyclic or wrapped. 


18 Note that only the composition within the hot regions is explicitly 
defined for the purpose of signal generation. 
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2.4. Instrument response models 


For NICER and each of three XMM EPIC cameras we 
use a tailored ancillary response file (ARF) and redis- 
tribution matrix file (RMF) to compose an on-axis re- 
sponse matrix. For each of the XMM cameras, the ARF 
and RMF are those specific to the PSR J07404-6620 ex- 
traction regions. 

For NICER, we use the most recent calibration prod- 
ucts made available by the instrument team, namely 
the nixtiref20170601v002 RMF and the nixtiaveon- 
axis20170601v004 ARF. For the latter, the effective ar- 
eas per energy channel were rescaled by a factor of 51/52 
to account for the removal of all events from DetID 34. 
The instrument response products (RMF and ARF files) 
for the XMM pn, MOSI, and MOS2 cameras tailored to 
the PSR J07404-6620 observations were produced with 
the rmfgen and arfgen commands in SAS and products 
from the Current Calibration Files repository. 

Calibration of the performance of NICER is conducted 
mainly with observations of the Crab pulsar and nebula. 
The energy-dependent residuals in the fits to the Crab 
spectrum are generally X 2%1°. The calibration accu- 
racy for the XMM MOS and pn instruments is reported 
to be less than 396 and 2% (at 1e), respectively 2° How- 
ever, in the absence of a suitable absolute calibration 
source, the absolute energy-independent effective area 
of NICER and the three XMM detectors is uncertain at 
an estimated level of +10%. 

We define a free parameter that operates as an energy- 
independent scaling factor shared by all instruments due 
to the lack of a perfect astrophysical calibration source. 
Further, for each of NICER and XMM, we define a free 
parameter that also operates as an energy-independent 
effective area scaling. T'he overall scaling factor for each 
telescope is a coefficient of the response matrix, defined 
as the product of the shared scaling factor and the a 
priori statistically independent telescope-specific scal- 
ing factor. The overall scaling factors of different instru- 
ments are therefore correlated a priori to simulate—in 
an albeit simple way—the absolute uncertainty of X-ray 
flux calibration and the instrument-to-instrument cal- 
ibration product errors. However, the correlated prior 
PDF is such that the effective area for NICER is permit- 
ted to decrease from the nominal effective area whilst the 
effective area for XMM can increase from its respective 
nominal effective area, and vice versa. We choose the 
statistically independent telescope-specific scaling fac- 
tors to have equal spread a priori to the scaling factor 
shared by all instruments of +10%, therefore yielding a 
more conservative joint prior PDF of the overall scal- 


19 See https:/ /heasarc.gsfc.nasa.gov/docs/heasarc/caldb/nicer/ 
docs/xti/NICER-xti20200722-Release-Notesb.pdf for further 


details. 


20 See in particular Table 1 in https://xmmweb.esac.esa.int /docs/ 


documents/CAL- TN-0018.pdf. 


ing factors that we approximate as a bivariate Gaussian 
(see Table 1). We discuss posterior sensitivity to effec- 
tive area prior information in Section 4.2. 

The response models are implicitly assumed to 
accurately represent the time-averaged operation of 
the instruments during the (composite) exposures to 
PSR J07404-6620. 


2.5. Likelihood function and backgrounds 


In this section we formulate the likelihood function 
as the conditional probability of the NICER and XMM 
event data sets.?! The relative constraining power of- 
fered by the XMM likelihood function is weak, but we 
provide detail about the methodology with the overar- 
ching aim of supporting the use of imaging observations 
in ongoing and future NICER pulse-profile modeling ef- 
forts. 

'The expected photon specific flux signal generated by 
the surface hot regions is calculated as a function of rota- 
tional phase (over a single rotational cycle) and energy, 
by integrating over the photon specific intensity image 
on the sky of a distant static instrument (Bogdanov et al. 
2019b). We then assume that an adequately performing 
model of both the NICER and XMM event data sets can 
be constructed by operating on the same incident signal. 
The count number statistics for both NICER and all 
XMM cameras is Poissonian: for every detector chan- 
nel of the four instruments—and then for NICER every 
phase interval associated with a channel—the sampling 
distribution from which the registered count-number 
variate is drawn is assumed to be a Poisson distribu- 
tion with an expectation that is a function of parame- 
ters of the incident signal and parameters of the model 
instrument response. 

'The NICER event time-tagging resolution is state-of- 
the-art, and PSR J0740--6620 is sufficiently faint that 
in the absence of backgrounds, the event arrival process 
statistics would deviate in an entirely minor way from 
the incident photon Poisson point process; in reality we 
contend with background radiation, and deadtime cor- 
rections to the integrated exposure time are calculated 
during event data pre-processing. Pile-up, the register- 
ing of multiple photons as a single event during a detec- 
tor readout interval, is not an issue for XMM because 
the source is so faint. 

As we expound upon in Sections 2.5.1 and 2.5.2, the 
background event data for all four instruments—after 
implementation of filters during data pre-processing 
(Wolff et al. 2021)—is modeled with a set of count rate 


21 The domain of the likelihood function is a subset of the model 
parameter space (which is in general discrete-continuous mixed). 
The domain of the more general parameterized sampling distribu- 
tion is a subset of the Cartesian product of the model parameter 
space and the data space; the likelihood function is the function 
that this sampling distribution reduces to when the data-space 
variables become fixed by observation. 
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variables, one per detector channel. As a corollary of 
the assumptions stated above, the background processes 
contributing to these events are also assumed to be Pois- 
sonian event arrival processes. The expectations of the 
sampling distributions of the registered count-number 
variates are simply sums of the expected count num- 
bers from the PSR J0740--6620 surface hot regions and 
the expected background count numbers. Refer to Riley 
et al. (2019) for details about the NICER count-number 
sampling distribution—the form of the XMM camera 
count-number sampling distribution follows by a phase- 
averaging operation. 


2.5.1. NICER 


Let dw be the NICER count matrix over phase 
interval-channel pairs. We now define variables that 
the NICER likelihood is a function of: let s be a vec- 
tor of parameters, collected from Section 2.2 above, of 
which the incident signal generated by the pulsar is 
a function; let NICER denote a (parameterized) model 
for the response of the instrument in response to in- 
cident radiation; and let {E|bn]} be a set of statisti- 
cally independent nuisance variables, one per detector 
channel that contains event data to be modeled, de- 
fined as the phase-invariant expected count rate. The 
NICER event data is phase-resolved, so there are mul- 
tiple random variates—distributed in phase—per back- 
ground random variable. The background model with 
variables {E[byn]} is free-form as discussed by Riley et al. 
(2019). The set of these variables is designed to han- 
dle complex channel-to-channel variations in the ex- 
pected phase-invariant count rate. A physical back- 
ground model should need (far) fewer random vari- 
ables for the underlying background-generating process 
to capture these complexities. The likelihood function 
is denoted by p(dw | s, {E[bn]}, NICER). 

We numerically marginalize over the variables {E[bn]} 
to yield a marginal likelihood function that is combined 
with a joint prior PDF to define a target distribution to 
draw samples from. The count rate variables have sepa- 
rable flat prior PDFs that are strictly improper because 
we do not explicitly define upper-bounds on the prior 
support of each variable (Riley et al. 2019); the poste- 
rior is considered integrable however, so these ill-defined 
prior PDFs do not result in posterior pathologies. The 
separable prior PDF of the count rate variables is overly- 
diffuse, with extremely high prior-predictive complexity, 
such that inferences will be insensitive to minor changes 
to the function: E[bw] ~ U(0,U), where the upper- 
bound U of the prior support is left unspecified.?? 

'The marginalization operation is separable over chan- 
nels, yielding a product of one-dimensional integrals. 


We need to perform fast numerical marginalization 
over the {E[bn]} in order to compress the dimension- 
ality of the sampling space. The simpler the form 
of the integrands—the functions of the (E[bw]]—the 
more straightforward fast numerical marginalization is. 
If the prior PDF p(E[bw]) is flat, the integrand has 
a single global maximum, and would be highly Gaus- 
sian if the peak of the conditional likelihood func- 
tion p(dw|s,E[bx], NICER) lies within the support of 
p(E(bx). 

As discussed by Riley et al. (2019), a major open ques- 
tion regards the total expected spectral signal that is 
attributable to the surface hot regions in reality. In 
lieu of a physical background model—which as discussed 
above is difficult to formulate for NICER-——independent 
information conditional on data acquired with an imag- 
ing X-ray telescope such as XMM can be fundamentally 
valuable for deriving robust inferences if the exposure 
time is sufficiently long—it will however be substan- 
tially shorter than the order megasecond exposures of 
the near-dedicated NICER telescope. 

Note that in order to infer the contribution from 
contaminating sources to the NICER event data,?4 we 
would need to separate out the (IE[bx]) into an environ- 
mental background model—with some informative prior 
PDF including space weather contributions—and some 
model for the contribution from contaminating (point) 
sources in the field that cannot be resolved from the 
point-spread function (PSF) of PSR J0740--6620. How- 
ever, for the principal purpose of constraining the phys- 
ical properties of the pulsar, we are uninterested in the 
distinction between NICER backgrounds. Moreover, we 
do not have the statistical power to distinguish the con- 
tributions if the priors for the components are all rather 
diffuse. In other words, if some of the priors are informa- 
tive, we do not have the statistical power to gain much 
information a posteriori. 


— 


2.5.2. XMM-Newton 


Information about the total background contribu- 
tion to the NICER event data—including contaminat- 
ing (point) sources in the field (see Bogdanov et al. 
2019a, for a breakdown of components)—is encoded 
in the XMM spectroscopic imaging event data. Due 
to the relatively brief exposure times of the observa- 
tions, very few background counts are available for a 
reliable background estimate. Thus, we obtained rep- 
resentative background estimates with higher photon 
statistics from the blank-sky event files provided by the 
XMM-Newton Science Operations Centre”. The blank- 


23 Assuming that the pulsed emission is dominated by rotationally 
modulated emission from the surface. 
24 That is, contamination that cannot be robustly filtered out dur- 


22 The posterior should be integrable—without proof here—even if 
the prior is improper, but if an upper-bound were to be required, 
it could for instance be based on NICER count rate limitations. 


ing event data pre-processing. 


25 See https: //www.cosmos.esa.int /web/xmm-newton/blank-sky. 
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sky images were filtered in the same manner as the 
PSR J07404-6620 field images and the background was 
extracted from the same location on the detector as the 
pulsar. The resulting background spectrum was then 
rescaled so that the exposure times and BACKSCAL 
factors match those of the PSR J0740+6620 exposures. 

Leveraging spatial resolving power, we can derive pos- 
terior inferences about the signal from the pulsar in iso- 
lation with little confusion about the expected contribu- 
tions from the pulsar and background sources. When 
this information about the pulsar signal is injected into 
modeling of NICER observations—either explicitly as 
a prior or as a likelihood factor—the contribution from 
the pulsar to the NICER event data is informed; here we 
work towards a likelihood function factor for the XMM 
telescope. 

Let dx be the XMM time-integrated count vector 
(over detector channels) from a region S of a CCD 
of a camera, that is some sufficiently large subset of 
the support of the PSF of the PSR J07404-6620 whilst 
optimizing signal-to-noise. We now define variables 
that the XMM likelihood is a function of. The vec- 
tor of parameters of the incident signal generated by 
the pulsar remains as s, shared with the NICER like- 
lihood function. Let XMM denote a (parameterized) 
model for the response of an XMM camera in response 
to incident radiation. Once more, in lieu of a physi- 
cal model for the underlying background-generating pro- 
cess, let us define one statistically independent random 
nuisance variable per detector channel: the expected 
count rate. These variables are collectively denoted 
[E|bx]). The likelihood function for each XMM cam- 
era is denoted by p(dx | s, {E[bx]},xmMM). We numeri- 
cally marginalize over these model variables in the same 
vein as for the NICER background-marginalized like- 
lihood function; however, to constrain the signal from 
PSR J0740+6620 we are in need of a more informative 
prior PDF of {E[bx]}. 

To form the prior PDF of {E[bx]} for an XMM cam- 
era, we consider a set of independent Poisson-random 
variates {2x} defined as time-integrated astrophysi- 
cal (sky) background count numbers in detector chan- 
nels. The events constituting {Ax} are extracted from 
a blank-sky?? region B of the camera CCD array that is 
disjoint from region S associated with PSR J07404-6620. 
The XMM cameras are composed of multiple side-by- 


26 There remains confusion, however, about what contribution from 
the pulsar and its near vicinity is generated by surface hot re- 
gions. 

27 Imaging observations are configured such that the target point 
source is confined to a single CCD, for instance to avoid masking 
part of the source PSF with gaps between CCDs. 


28 Where there is now one variate per channel because the count 
numbers are phase-averaged. 

29 That is, a region of sky devoid of any (bright) non-diffuse X-ray 
sources. 


side CCD detectors. For each camera, it is preferable 
to choose the region 6 on the same detector as S be- 
cause the different CCD detectors have slightly differ- 
ent responses to incident radiation and therefore exhibit 
slightly different astrophysical (sky) backgrounds. For 
blank-sky exposures, the conditional sampling distribu- 
tion in the space of the data is p(( x) | {E[Ax]}), where 
the variables {E[@x]} are per-channel expected count 
numbers. 

To constrain the background in the XMM images of 
PSR J0740+6620, we use blank-sky estimates generated 
using the XMM-Newton SAS tools. The blank-sky ex- 
posures are much longer in duration than the exposures 
to PSR J07404-6620 by almost two orders of magni- 
tude, allowing us to constrain the astrophysical (sky) 
background count rate in the on-source exposures more 
tightly (in the absence of systematic error) than possible 
from blank-sky extraction regions from the same CCD 
during the shorter on-source exposure. For each XMM 
camera the respective region B is not only localized to 
the same CCD as the PSR J07404-6620 PSF, but is the 
same region of the CCD. However, the blank-sky esti- 
mates are based on an ensemble of pointings over the 
sky. Our cross-check of the sky background in these 
reference exposures against the sky background in the 
vicinity of PSR J07404-6620 did not yield evidence of 
systematic difference. 

We now formally derive the constraints on the ex- 
pected number of background counts per detector chan- 
nel registered within the source region S of an XMM 
camera, {2x}, conditional on the counts registered in 
region 6 over the ensemble of blank-sky exposures. A 
prior PDF of the variables (IE[Z7x]) is needed. In the 
same vein as for the NICER background variables, we 
define a separable prior PDF 


{| Ax] od U (0, Y), (1) 


where the upper-bound ¥Y of the prior support is left 
unspecified. This trivial model over-fits the background 
data, with posterior PDF 


PUE[2x]} | Ux 3) x p( (x |{E[Ax]}); (2) 


the vector of random variates {Ax} is coincident in 
value with the maximum a posteriori vector in the space 
of (E[Zx]]- 

As is the case for the NICER background marginal- 
ization operation described in Section 2.5.1, if the PDF 
p((E[Z2x]) | [22x )) is flat, the marginalization is more 
straightforward to execute. For example, we could 
form a flat prior PDF spanning the highest-density 
xr% posterior credible interval in E[bw], given {Ax}. 
We form a flat prior PDF in a simpler way. The 
PDF p((E[Zx]) | {Ax}) has the approximate structure 
of a truncated Gaussian with deviation dependent on 
the number of counts yxy. We let the lower- and 
upper-bounds of the support respectively be Z :— 
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Figure 4. The black step function is the XMM pn camera 
PSR J0740+6620 count number spectrum as a function of 
detector channel nominal photon energy. The XMM events 
contain source events and events from diffuse background 
that can be estimated from blank-sky information. The 
NICER count number spectrum is the red solid step function, 
including PSR J07404-6620 events and all backgrounds; the 
red dashed step function is the empirical pulsed count num- 
ber in each channel. The XMM pn events are clearly sparse 
in comparison to the NICER event data, but note that the 
number of pulsed NICER counts is lower than indicated here, 
as shown in Figure 1. The blue step function is the count 
number spectrum derived from blank-sky exposure, scaled 
down to the exposure time on PSR J07404-6620 and also 
scaled for the CCD extraction region area ratio As/Ag as 
described in the main text—please see Equation (3). The 
orange shaded region is the support of the joint prior PDF 
of the expected count rate variables {E[bx]}, derived from 
the blank-sky exposures. The figure elements rendered here 
are representative of the corresponding information from the 
XMM MOS1 and MOS2 cameras, as can be seen in the on- 
line figure set associated with Figure 16. 


max (0, Zx — ny Zy) and V := Bx +nV/#x, where 
n is a setting that controls the degree of conservatism. 
For some channels the number of counts is low and the 
PDF p((E[Zx]) | {2x}) deviates substantially from be- 
ing a truncated Gaussian, but we nevertheless adopt the 


30 'This procedure is sufficient for the channel cuts we make because 
there is always a higher channel with a finite number of counts 
Bx. 

31 Note that the binary companion of PSR. JO740+6620 has been in- 
ferred to be an ultra-cool white dwarf (Beronya et al. 2019) whose 
thermal surface emission would not contribute X-ray events, un- 
less there is some interaction with higher-energy winds from 
PSR J0740+6620. 


same procedure—the lower-bound is pushed to zero or 
far into the lower-tail of the distribution, but the upper- 
bound remains sensible. For some channels, however, 
By = 0. In these cases, we simply set -Z := 0 and then 
iterate upwards in channel number to locate the next 
finite value of Zx + n/Zx.?? We choose n = 4. With 
the flat PDF of EE[Z/x] defined, we transform variables 
to derive the prior PDF p(E[bx] | [Zix )) according to 


its) = = ie (3) 


where Tg is the blank-sky exposure time needed to 
transform to a count rate, and As/Ag is the ratio of 
the areas of the extraction regions S (encompassing the 
PSR J0740+6620 PSF) and region B. 

In many respects this flat PDF of E[@x] is a re- 
markably conservative choice for the prior constraint on 
p(E[bx]| {Ax}). A reason it is not conservative is the 
assumption of zero prior mass above an upper count-rate 
limit Y somewhere in the upper-tail of the true prob- 
ability PDF p((E[Zx]M | {4x}), especially because the 
count-number data dx for each XMM camera are mod- 
erately consistent with being generated by background 
processes. The upper-limit % can be decreased so that 
p(E[bx] | {4x}) is more informative at the risk of bias. 
On the other hand, Y can be increased, which natu- 
rally weakens the constraining power but can be jus- 
tified as a safety precaution to capture a contribution 
such as a power-law component originating from the 
magnetosphere of PSR J07404-6620 that is not explic- 
itly modeled.?! Alternatively, we could justify a higher 
upper-limit in terms of systematic error due to blank- 
sky estimates derived from an ensemble of exposures 
over the sky and variation in time of the XMM camera 
response matrices between the PSR J07404-6620 expo- 
sure and blank-sky exposure ensemble. The setting of 
n = 4 seems like a reasonable—albeit arbitrary—choice 
to balance information loss versus bias; we probe poste- 
rior sensitivity to this hyperparameter and conclude our 
inferences are insensitive to its value (see Section 3). We 
display the XMM pn camera count number spectrum 
in Figure 4 together with the blank-sky derived prior 
PDF of the expected count rate variables [E[ZZx]Y; we 
also display the NICER count number spectrum for data 
quality comparison. 


— 


2.5.3. The likelihood function 


The likelihood function given the NICER and XMM 
data sets may be written as 
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p(dn, dx, {2x} | s, {Elbn]}, (E[bx])) = p(dn | s, { 


where we combine the expected background count rate 
variables over the XMM cameras into {E[bx]}, we com- 
bine the count numbers in the PSR J07404-6620 ex- 
posures into dx, and we combine the blank-sky count 
numbers into {4x}. Introducing the flat prior densities 


(uy 


p(dn, dx, {2x} | s) x 
{0} 


The background-marginalized likelihood function is fed 
as a callback to a sampling process, together with a joint 
prior PDF callback for the pulsar signal parameters s 
and parameters associated with the NICER and XMM in- 
strument response models. 


2.6. Model space summary 


All nodes in the model space share some underlying 
physics. Namely, the machinery for relativistic ray- 
tracing: an oblate surface is embedded in an ambient 
Schwarzschild spacetime, and the X-ray emission emer- 
gent from the atmosphere is attenuated by the inter- 
stellar medium as it is transported to a distant static 
telescope. The nodes in the model space differ first and 
foremost in their surface hot region parameterization 
complexities and atmosphere composition flag values, 
but also in terms of the prior PDF and the likelihood 
function factors. The prior PDFs for the parameters 
controlling the shared processes (i.e., mass, equatorial 
radius, viewing angle to the spin axis, distance, col- 
umn density) are either informative—such as the joint 
NANOGrav and CHIME/Pulsar measurement of mass, 
distance, and viewing angle—or diffuse if there is lim- 
ited prior knowledge or we aim to probe the consistency 
of likelihood function factors across telescopes. 


2.7. Posterior computation 


We implement nested sampling to compute the pos- 
terior distribution conditional on each model. Namely, 
we use X-PSI to construct the likelihood function and 
the prior PDFs and then couple them to MULTINEST 
(Feroz & Hobson 2008; Feroz et al. 2009; Buchner et al. 
2014). For the headline model we report in this Letter, 
including NICER and XMM likelihood factors, the di- 
mensionality of the sampling space is 15. Details about 
our nested sampling protocol are given in Riley et al. 
(2019, see the appendix matter in particular) and also 
in Riley (2019, see chapter 3 and the associated ap- 
pendix). In summary, our minimum resolution settings 
are as follows: 10? live points; a bounding hypervolume 


li p(dw | s, (Elbx]), NICER)d[E[bw]) 


[bx], NICER) p(dx | 8, (E[bx]]; XMM)p({ Ax} | {E[bx]}), — (4) 


from Equation (1), approximating the posterior PDF 
p({E[|Ax]} |{Ax}) as flat and bounded as described 
in Section 2.5.2, and marginalizing over all expected 
background count rate variables yields the background- 
marginalized likelihood function 


—— 


{%} 
I p(dx | s, {E[bx]}, XM M)d[ E|Zx]J. (5) 
{2} 


expansion factor of 0.1~!; and an estimated remaining 
log-evidence of 107}. Regarding live points, most poste- 
riors for sensitivity analyses are computed with 2 x 10? 
or 4 x 10? live points, and production calculations used 
4 x 10* live points. The number of live points is the 
most fundamental parameter that should be changed 
to probe sampling resolution sensitivity—the expansion 
factor can be fixed at some value similar to those recom- 
mended in the literature (Feroz et al. 2009). Likelihood 
function evaluation time is the dominant sink, being 
several seconds per core for the processor speeds typ- 
ical on a cluster or supercomputer. We do not use the 
constant efficiency algorithm variant for any sampling 
process, and we do not use the mode-separation algo- 
rithm variant unless stated otherwise. Regarding the 
mode-separation variant, if there are multiple modes of 
commensurate posterior mass, sampling resolution gets 
distributed between those modes. It follows that the 
bounding approximation to the constant likelihood hy- 
persurfaces in each mode is lower than if the global reso- 
lution settings were consumed solely by that mode. We 
eliminate hot region exchange degeneracy from the prior 
support as discussed in Section 2.3.1, which eliminates 
mirrored modes and thus improves the resolution of that 
mode in terms of bounding approximation error. 

For most posteriors reported in this work, the nested 
samples are considered high-resolution in the context 
of literature recommendations and, for the production 
analysis, were costly to generate given resource limita- 
tions. It is important to remark that our posterior com- 
putation procedure has not been validated in any mean- 
ingful way via simulation-based calibration because at 
present it is basically intractable for any one group to 
calibrate credible region coverage on a model-by-model 
basis (see the discussion in chapter 3 of Riley 2019 and in 
Riley et al. 2019). For discussion on the level of calibra- 


32 For additional details about these variants, refer to Riley et al. 
(2019) and references therein. We also do not use the importance 
sampling algorithm variant (Feroz et al. 2013). 
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tion we have attained by cross-checking against indepen- 
dent calculations performed by another group (Miller 
et al. 2019), we refer to Bogdanov et al. (2021). How- 
ever, we open-source the entire analysis package for this 
Letter, so another group with resources is free to modify, 
cross-check, and improve upon the posterior computa- 
tion. 


3. INFERENCES 


In this section we report our inferences. We first sum- 
marize the measures used to assess model performance. 
We then discuss an exploratory analysis that examined 
sensitivity to resolution settings and selected model as- 
sumptions, in particular the effects of different assumed 
atmospheric composition and hot region configuration. 
We then report high-resolution posterior inferences for 
the superior model. 


3.1. Performance measures 


For pulse-profile modeling with X-PSI a set of perfor- 
mance measures should be considered for each model in 
the model space, largely following the protocol of Riley 
et al. (2019). The first measure, given a set of poste- 
rior samples, is graphical and the most practical: basic 
posterior-checking by inspecting for inconsistency be- 
tween the statistically independent NICER count num- 
ber variates and the separable sampling distribution 
from which those variates are assumed to be drawn a 
posteriori. We estimate the expectation with respect 
to the posterior of the expected count numbers and 
form Poisson sampling distributions from these expected 
count numbers. If there is clear structural difference— 
namely residual correlations in the joint space of en- 
ergy and phase—then the model cannot generate data 
with the structure of the real count numbers. Suppos- 
ing there are no discernable correlations, then because 
the sampling distribution for each variate is Poissonian 
with a sufficiently large expectation for the distribution 
to be well-approximated as Gaussian, we can inspect 
the distribution of the standardized residuals to identify 
any clear deviation from a normal distribution—e.g., too 
much or too little weight in the tails—that would be in- 
dicative of noise-model inaccuracies. If this check also 
passes, then the model has sufficient complexity to gen- 
erate synthetic data with the structure of the NICER 
PSR J0740--6620 event data and is adequate for simu- 
lation purposes—e.g., for statistical forecasts of the con- 
straining power achievable with future X-ray space tele- 
scope concepts such as the enhanced X-ray Timing and 
Polarimetry mission (eXTP; Zhang et al. 2019; Watts 
et al. 2019) and the Spectroscopic Time-Resolving Ob- 
servatory for Broadband Energy X-rays (STROBE-X; 
Ray et al. 2019). 

'The second measure we inspect is the maximum like- 
lihood estimate reported by a nested sampling process. 
Note that nested sampling does not target maximum 
likelihood estimation, but the drawing of samples from 


the typical set of a target distribution—the maximum 
likelihood estimate is therefore also subject to the con- 
centration of prior mass in parameter space. More 
specifically, we are working with the estimated maxi- 
mum of a background-marginalized likelihood function 
given by Equation (5) or one of the likelihood factors 
(i.e., the NICER or XMM likelihood function). We 
can use these point estimates to compare, in a simple 
way, models of the same count number variates. We 
only graduate to comparison of models using maximum 
likelihood estimates if the graphical posterior checking 
described above does not reveal failures. The model 
that reports the highest maximum likelihood estimate 
amongst those that model the same count number vari- 
ates has a sampling distribution? that captures the 
most structure in the set of count number variates. It is 
plausible, therefore, that a data-generating process de- 
fined by that model is the closest approximation of phys- 
ical reality attained by the models considered. For mod- 
els that can a posteriori generate data with the structure 
of the real count number variates, the maximum likeli- 
hood estimate can be used to resolve small differences 
that human inspection fails to uncover. 

'The third measure that we examine when comparing 
models of the same set of count number variates is the 
evidence (the prior predictive probability distribution 
evaluated using the real count number variates). Whilst 
maximum likelihood estimates are mere point estimates, 
the evidence is the expectation of the likelihood function 
with respect to the joint prior PDF. In one respect, this 
is powerful because unwarranted prior predictive com- 
plexity is penalized: if too much complexity is added 
to model M to construct model Mt, then for Mt the 
likelihood function over a large swathe of prior mass is 
smaller than the expected likelihood (with respect to the 
prior) of model M, which can entirely negate any local- 
ized increases in the likelihood function. In other words, 
much of the additional complexity is unhelpful because 
the data generated does not have a similar structure to 
the real data. On the other hand, penalizing complexity 
in this way is arguably misleading if the model has phe- 
nomenological components: in this Letter the surface 
hot region models are phenomenological. 

'The evidence, together with a prior mass function of 
nodes of the model space, may be a biased model selector 
in our context. Unfortunately, it is necessary in a formal 
Bayesian framework to use the evidence to marginalize 
over nodes of the model space in order to compute a pos- 
terior PDF of parameters of interest that are shared be- 
tween nodes. Marginalizing over nodes that differ solely 
by the atmosphere composition is not problematic. If 
we do not formally marginalize in such a manner over 
all models, however, then we can only report the pos- 


33 Within the continuous set of such distributions associated with 


the model. 
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terior distribution (marginalized over atmosphere com- 
position) for each hot region model that satisfies the 
graphical posterior checking criterion, together with the 
maximum likelihood estimate. The reader is then free 
to interpret the model-to-model posterior variation as a 
systematic error estimate by, for instance, weighting the 
posteriors equally which would roughly lead to credible 
regions with near-maximum hypervolume (width in one 
dimension, area in two dimensions, and so on); formally, 
this is equivalent to defining an implicit prior mass dis- 
tribution over model space nodes that happens to nullify 
evidence differences, leading to a uniform posterior mass 
function of models. Alternatively, any other weight- 
ing can be interpreted as the reader choosing their own 
prior mass function of model space nodes. 

Lastly, when comparing models of the same set of 
count number variates, we also consider the tractability 
of the model. If two models pass graphical posterior pre- 
dictive checks, and supposing one model is less complex, 
that model is almost by definition more straightforward 
to implement and to compute the posterior for. The ade- 
quately performing model that requires fewest resources 
to reproduce or prove erroneous—thereby increasing the 
robustness and potentially the computation accuracy— 
can be reasoned to be the most useful in practice. 


3.2. Exploratory analysis 


In this section we report on posterior sensitivity to var- 
ious features of the analysis pipeline. Although one can 
attempt to probe sensitivity using importance sampling, 
we opted for nested sampling for every variant of inter- 
est. In our sensitivity analyses, our nested sampling res- 
olution settings are lower than for the production analy- 
sis because we were ultimately resource-limited. We var- 
ied the number of nested sampling live-points and the 
bounding hypervolume expansion factor; we explored 
XMM background prior hyperparameter variation; we 
switched the atmosphere composition from hydrogen to 
helium; and we varied likelihood function resolution set- 
tings. We have not probed sensitivity to event data set 
selection (namely, NICER detector channel cuts) nor 
sensitivity to approximation of the atmosphere ioniza- 
tion state as fully-ionized. 

The posteriors we report in this section were com- 
puted using at least 2 x 10° live points and (except 
where explicitly noted) condition on the ST-U model 
and either the NICER likelihood function or the NICER 
and XMM likelihood function. For a full description of 
this model, and a schematic diagram, see Riley et al. 
(2019). Briefly, however, ST-U assumes each hot region 
is a single-temperature spherical cap. The two regions 
can have completely independent properties (tempera- 
ture and size) and are free to take any location on the 


34 And more formally still, this would mean the prior mass function 


is dependent on the data, which is a fallacy. 


star’s surface provided that they do not overlap (see 
also the discussion in Section 2.3.1). Our exploratory 
analysis indicated that this model provided an adequate 
description of the PSR J07404-6620 data set using the 
performance measures outlined in Section 3.1. Finally, 
note that posteriors reported in this section are condi- 
tional on a NICER exposure time that was erroneously 
high by ~ 2%. We corrected this number for a subset of 
posteriors reported in this section, and for the produc- 
tion analysis (Section 3.3); our posteriors are however 
insensitive to this level of error in exposure time. 


3.2.1. Impact of radio timing prior information 


The informative joint NANOGrav and 
CHIME/Pulsar prior is critical for deriving a useful 
constraint on the radius of PSR J0740+6620. For 
comparison, we compute a posterior conditional on a 
fully-ionized hydrogen atmosphere and a diffuse, sep- 
arable prior PDF of the mass, the distance, and the 
cosine of inclination angle. The mass prior is such that 
the joint prior PDF of mass and radius is flat within the 
prior support (see Table 1). The distance prior PDF 
is adopted from Igoshev et al. (2016) and displayed in 
Figure 2, with support D € [0.1,10.0] kpc. The prior 
PDF of the cosine of the inclination angle is isotropic, 
meaning flat with support cos i € [0, 1]. 

Fig. Set 5. Exploratory analysis. 

'The radius posterior conditional on the diffuse prior 
is much broader than when we condition on the joint 
NANOGrav and CHIME/Pulsar prior, and there is no 
independent indication from the pulse-profile modeling 
for a high mass (see Figure 5). Once the models are con- 
ditioned on the joint NANOGrav and CHIME/Pulsar 
prior PDF, we gain very little additional information a 
posteriori from the X-ray likelihood function about the 
pulsar mass, distance, and inclination angle with respect 
to the spin axis. We can verify this by examining the 
marginal posterior distributions in comparison to the 
respective prior distributions, which is summarized for 
each parameter by the Kullback-Leibler divergence esti- 
mate (Kullback & Leibler 1951). 


3.2.2. Impact of X-ray telescopes 


'The NICER likelihood function is sensitive to the ba- 
sic configuration of the surface hot regions and their 
temperatures, despite the relatively small number of 
pulsed counts (those above the phase-invariant back- 
ground). The XMM likelihood function is less infor- 
mative both due to the lack of phase information, and 
because the events are sparse for all three EPIC cam- 
eras and moderately consistent with the expected back- 
ground signal derived from blank-sky exposures. How- 
ever, the XMM likelihood function acts to reduce the 
NICER posterior mode volume substantially, affecting 
the inferred radius and geometry. The reason for this 
is because the XMM likelihood is sensitive to the com- 
bined phase-averaged signal from the hot regions be- 
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Figure 5. One- and two-dimensional marginal PDFs conditional on the ST-U model, the NICER likelihood function alone, 
and one of three prior PDFs to probe the impact of radio timing information. From leftmost to rightmost in each panel, the 
parameters are the equatorial radius, the gravitational mass, the cosine of viewing angle subtended to pulsar spin axis, the 
distance, and the column density. We display the marginal prior PDFs for each parameter as the dash-dot functions; the 
informative priors encode the information from NANOGrav x CHIME/Pulsar and Cromartie et al. (2020, denoted by the 
conditional argument C+20), and the diffuse prior is described in Section 3.2.1. We report estimators for the NICER posterior 
conditional on the joint NANOGrav and CHIME/Pulsar prior. We report the KL-divergence, Dxr, from prior to posterior in 
bits for each parameter. The shaded credible intervals Cles% for each parameter are symmetric in marginal posterior mass about 
the median, containing 68.396 of the mass. The credible regions in the off-diagonal panels, on the other hand, are uniquely the 
highest-density—and thus the smallest possible—credible regions, containing 68.396, 95.496, and 99.796 of the posterior mass. 
In the appendix of Riley et al. (2019) we provide additional information regarding posterior kernel density estimation (KDE), 
error analysis, and the estimators displayed here; note that here we use an automated Gaussian KDE bandwith optimized by 
GetDist (Lewis 2019). The complete figure set for the exploratory analysis (7 images) is available in the online journal. 
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ing too bright. "Therefore, given the NICER likelihood 
function, we constrain the contribution to the unpulsed 
portion of the pulse-profile that must be generated by 
the hot regions rather than the backgrounds. Posterior 
figures demonstrating this are reserved for Section 3.3. 


3.2.3. Effect of atmospheric composition 


The atmospheric composition of PSR J0740+6620 is 
not known a priori. We therefore compared ST-U poste- 
riors for hydrogen and helium atmospheres assuming full 
ionization—see the discussion in Section 2.3.2—in the 
second online figure of the set associated with Figure 5. 
The marginal radius posteriors were indistinguishable, 
although there were some small changes in the proper- 
ties of the hot regions. However, given the apparent lack 
of sensitivity to atmospheric composition, inferences re- 
ported hereafter are conditioned on a fully-ionized hy- 
drogen atmosphere—we do not need to marginalize over 
the binary atmosphere parameter. Both hot regions are 
inferred to have effective temperatures T ~ 10° K, at 
which partial ionization effects should be small. 


3.2.4. Hot region complexity 


The Riley et al. (2019) analysis of PSR J0030--0451 
reported a number of hot region models that provided 
an adequate description of the data according to their 
performance measures (largely adopted here). These in- 
cluded ST-U and variants in which one of the hot re- 
gions was permitted increasingly complex forms, includ- 
ing rings and crescents. The inferred radius changed 
as model complexity increased, but evidence calcula- 
tions showed a substantial improvement in model per- 
formance. As a result of this, we deemed the ST+PST 
model—in which one hot region is a single temperature 
spherical cap and the other is, a posteriori, a crescent— 
to be superior for PSR J00304-0451. 

For PSR J0740+6620 the ST-U model also provides 
an adequate description of the data. In order to assess 
whether additional complexity is useful we also condi- 
tion on the ST+PST model. The posterior configuration 
and properties of the hot regions conditional on this 
more complex model (which includes the possibility of 
hot regions that are simply spherical caps) did not dif- 
fer in an important way from the configuration inferred 
from ST-U: the hot region for which more complexity 
was permissible exhibited degeneracy a posteriori—we 
were not sensitive to the existence of additional emission 
structure beyond that of a simple spherical cap, and the 
evidence estimates are consistent. There was therefore 
no extended crescent as inferred for PSR J0030--0451; 
the likelihood function degeneracy included some subset 
of possible crescent structures—those on smaller angu- 
lar scales (see Riley et al. 2019, for discussion about hot 
region structure degeneracy)—which may be of interest 
to pulsar modelers. The inferred radius changed very 
little (see the third online figure of the set associated 


with Figure 5), and there was no increase in model per- 
formance. For this reason we hereafter report inferences 
exclusively for the ST-U model. 


3.2.5. XMM-Newton background prior sensitivity 


As described in Section 2.5.2, the XMM background 
is free-form, but each variable (one per channel) has a 
prior with compact support. For each variable, a flat 
prior PDF is defined whose width is controlled by a hy- 
perparameter n. For the headline inferences reported in 
this Letter we used n — 4, having tested sensitivity to 
varying n in the range n € [0.01,8]. In the limit that 
n tends to zero, the background information would be 
treated as a point estimate of the XMM background. 
'The posterior distribution of the radius is insensitive to 
n being varied through the range n € [0.01,4]; see the 
fourth online figure of the set associated with Figure 5. 
It broadens slightly for n — 8 because fainter combined 
signals from the hot regions have greater background- 
marginalized likelihoods, yielding additional posterior 
weight for higher-radius configurations that reduce the 
unpulsed component whilst conserving the pulsed com- 
ponent to satisfy the NICER event data. However, the 
value n — 8 is arguably too conservative even when con- 
sidering potential systematic error. 


3.2.6. Likelihood function resolution sensitivity 


The X-PSI likelihood function has a number of res- 
olution settings, most notably settings that control the 
discretization of the computational domain for compu- 
tation of signals (pulse-profiles) incident on telescopes. 
'The photon specific flux signal we require is an integral 
over a distant observer's sky of the photon specific in- 
tensity from the hot regions, yielding a two-dimensional 
function of time (rotational phase) and photon energy. 
'The level of discretization with respect to four variables 
in the domain of the incident photon specific intensity 
field generally controls the computational expense of 
likelihood evaluation. These variables are the number of 
rotational phases and energies the photon specific flux 
signal is computed at; the number of hot region surface 
elements; and the number of rays calculated. Please see 
the X-PSI documentation?? for additional information. 

We tested posterior sensitivity to increasing the dis- 
cretization degrees for these variables by recomputing a 
posterior PDF with a new nested sampling process. We 
found that doubling these discretization degrees does 
not yield a change in the posterior PDF that is clearly 
distinguishable from Monte Carlo sampling noise;?? see 
the fifth online figure of the set associated with Figure 5. 


35 https:/ /thomasedwardriley.github.io/xpsi/ 


36 The nested sampling seed was set based on the system clock for 
each sampling process and therefore not held constant as would 
be ideal. 
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3.2.7. Nested sampling resolution sensitivity 


For a fixed bounding hypervolume expansion factor of 
0.1-!, the posterior PDFs were insensitive to doubling 
live-point number from 10? up to 4 x 10°. Following 
sampler comparisons within the NICER collaboration, 
we then increased resolution to 4 x 104 live points, lead- 
ing to broadening of the radius posterior; see the sixth 
and seventh online figures in the set associated with 
Figure 5. Increasing the sampling resolution by using 
8 x 10^ live points led to some further broadening, but 
doubled an already large computational cost. Given the 
computational resources required for posterior compu- 
tation with such a large number of live points, we were 
not able to rigorously prove convergence with live-point 
number. We decided to adopt 4 x 104 live points for the 
production analysis—all information necessary to repro- 
duce and improve upon our posterior computation is 
available in open-source repositories. 

Such posterior mode broadening with increased nested 
sampling resolution is naturally expected because nested 
sampling algorithms approximate hypersurfaces in pa- 
rameter space of constant likelihood; these approxima- 
tions improve with sampling resolution but their suffi- 
ciency is difficult to prove for non-trivial likelihood func- 
tions encountered in real problems and when subject to 
resource limitations. It is desirable to transform away 
non-linear modal degeneracies so that an approximation 
conforms more efficiently to structure in the sampling 
space; however, this can in practice be intractable task 
for a given problem. Moreover, when sampling from 
a target distribution with two or more modes of com- 
mensurate posterior mass, the live-point resolution is 
roughly split between the modes, reducing the resolu- 
tion of a given mode due to the approximations alluded 
to above. For PSR J0740--6620, posterior bimodal- 
ity arises due to the near-equatorial inclination of the 
source, leading to two competitive geometric configura- 
tions of the hot regions (see Section 3.3, where we discuss 
this further). 


3.3. Production analysis 


Our exploratory analysis indicates that model ST-U 
provides an adequate description of the data and that 
the posteriors are largely insensitive to either atmo- 
spheric composition or increased hot region complexity. 
In this section, we present high-resolution posteriors— 
using 4 x 10* live points—conditional on the ST-U model, 
and fully-ionized hydrogen atmosphere. For each poste- 
rior we use either the NICER likelihood function alone, 
the NICER and XMM likelihood function, or the XMM 
likelihood function alone. The posterior PDF condi- 
tional on the NICER likelihood function alone is derived 
by importance-sampling another posterior PDF, thereby 
updating a deprecated radio timing prior PDF (see Sec- 
tion 2.2.2 and Fonseca et al. 2021a). The weighted and 
equally-weighted samples from the marginal joint poste- 
rior distribution of mass and radius may be found in the 


persistent repository of Riley et al. (2021), together with 
credible region contour point-sequences and marginal 
posterior PDFs of the radius (to facilitate plotting). 

Figure 6 provides a simple graphical posterior pre- 
dictive check on the model performance, demonstrat- 
ing that the ST-U model can generate synthetic event 
data that is commensurate with the NICER data. 
No unexpected structure—such as large deviations or 
correlations—is emergent in the residuals. 

Marginal posterior distributions of the spacetime 
parameters—in particular the radius—are shown in Fig- 
ure 7. The figure displays posteriors conditional on the 
NICER data alone, conditional on the XMM data alone, 
and conditional on both NICER and XMM data. As ex- 
pected, the phase-resolved NICER likelihood function 
is far more constraining, in isolation, than the phase- 
averaged XMM likelihood function. However, the likeli- 
hood function product over telescopes excludes regions 
of the NICER posterior modes where the contribution 
to the unpulsed component of the pulse-profile from the 
hot regions is too bright. The unpulsed component is 
brighter for models in the NICER posterior modes where 
the star is more compact. Restricting to lower compact- 
ness increases the inferred radius for the combined data 
set by ~ 1 km. 

The inferred hot region parameters, again comparing 
the likelihood functions in isolation to the likelihood 
function, are shown in Figure 8. The effect of includ- 
ing the XMM data can be seen in the joint posterior 
PDF of the stellar radius and the angular radii of the 
two hot regions (p and Çs). The NICER-only posteri- 
ors (at the 99.7% level) include models with a smaller 
stellar radius (< 10.5 km) and hot regions with a larger 
angular radius (¢ ~ 1.2 rad). The large hot regions on 
very compact stars lead to a bright unpulsed component 
of the combined signal from those regions. The inclusion 
of the XMM data means that more of the unpulsed sig- 
nal is associated with the background instead of the hot 
regions. As a result, these smaller stars with large hot 
regions are excluded when the XMM data is included. 

Interestingly there are two different posterior modes 
(due to the near-equatorial inclination),"" which can be 
seen in more detail in the phase-averaged skymaps in 
Figure 9 and the animated skymap in Figure 10. For 
neither mode are the two hot regions antipodal. The 
effect on the inferred signal (pulse-profile and phase- 
averaged spectrum) of combining the two data sets is 
shown in Figures 11 and 12, respectively. The inclusion 
of the XMM data reduces the contribution of the hot 
region emission to the unpulsed component of the pulse 
profile, leading to a lower count rate but an increased 


37 Note that these are different geometric configurations because we 
expressly exclude hot region exchange degeneracy from the prior 


support (see Section 2.3.1). 
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Figure 6. NICER count data {dix}, posterior-expected count numbers {cik}, and (Poisson) residuals for ST-U. Note that 
we split the count numbers in the upper two panels over two rotational cycles, such that the information on phase interval 
$ € [0,1] is identical to the information on $ € (0,2]; our data sampling distribution, however, is defined as the (conditional) 
joint probability of all event data grouped into phase intervals on ¢ € [0,1]. We display the standardized (Poisson) residuals 
in the bottom panel: the residuals for the rotational cycle ¢ € [0,1] were calculated in terms of all event data on that interval 
(as for likelihood definition), and simply cloned onto the interval ¢ € (1,2]. In Section 3.1 and in the appendix of Riley et al. 


(2019) we elaborate on the information displayed here. 


pulsation amplitude (see lower panels of Figure 11) in 
the combined signal from the hot regions. 


4. DISCUSSION 
4.1. Radius measurement and implications for EOS 


'The inferred equatorial radius of the massive pulsar 
PSR J0740--6620 is Reg = 12.391139 km, where the 
credible interval bounds are approximately the 16% and 


84% quantiles in marginal posterior mass, given rela- 
tive to the median. The inferred mass, 2.0727} 064 Mo 
is dominated by the mass prior from the radio timing, 
2.08 + 0.07 Mo. The 90% credible interval for the ra- 
dius is 12.39* 222 km and the 95% credible interval is 
12.39*702 km. It is worth stressing that when we car- 
ried out pulse-profile modeling without using the mass 
prior from radio timing, we would not have inferred in- 


Ny [102° cm7?] 


co 


© 


A 


N 


A NICER VIEW OF THE MASSIVE PULSAR PSR J0740+6620 21 


Cles% = 12.305155 
Dk = 0.58 


Cles% = 2.072*0 007 
Dg = 0.01 


XMM « ST-U + FI-H 


NICER «€ ST-U + FI-H 
NICER x XMM « ST-U + FI-H 


Cles% = 0.0424*9-0029 
Dg = 0.01 


Clea% = 1277033 
Dy, = 0.10 


Cles% = 15871123 
Dy, = 0.91 


Eu d» X g? de o? o9 49 AY AP A Lb MS 
Q oF OF ow o Ny [102° cm7?] 
: D [kpc] 
cos(i) 


Figure 7. One- and two-dimensional marginal PDFs for fundamental parameters, conditional on the ST-U model and either 
the XMM likelihood function alone, the NICER likelihood function alone, or the NICER and XMM likelihood function. From 
leftmost to rightmost in each panel, the parameters are the equatorial radius, the gravitational mass, the cosine of viewing angle 


subtended to pulsar spin axis, the distance, and the column density. We display the marginal prior PDFs for each parameter 
as the dash-dot functions; the informative priors encode the NANOGrav x CHIME/Pulsar information. We report estimators 
for the NICER x XMM posterior. We use an automated Gaussian KDE bandwith optimized by GetDist (Lewis 2019). See the 
caption of Figure 5 for additional details about the figure elements. 


dependently from the NICER and XMM data alone that 
PSR J0740--6620 is a high-mass source (nor did we ob- 


tain any informative constraint on the radius; see Sec- 
tion 3.2.1). 
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Figure 8. One- and two-dimensional marginal posterior distributions of hot region parameters conditional on the ST-U model. 
Three types of posterior distribution are rendered: one conditional only on the NICER likelihood function; one conditional only 
on the XMM likelihood function; and one conditional on the NICER and XMM likelihood function. 


Rotation increases the equatorial radius of a neutron 
star. The increase in radius for fixed mass is small, as 
shown in Figure 13, where mass-radius curves are plot- 
ted for four representative EOS that span a wide range 
of allowed stiffness (Hebeler et al. 2013). Mass-radius 
curves for non-rotating stars and stars rotating at 346 Hz 
are shown for each EOS. For a 2.0 Mo star, the equato- 


rial radius increases by slightly more than 0.05 km for 
the softest EOS, while the increase is as large as 0.2 km 
for the stiffest EOS. These increases in radius due to 
spin are smaller than our uncertainty in measuring the 
neutron star's radius at present. The change in radius 
due to spin is already incorporated into our pulse-profile 
models, since we assume the shape of the rotating star 
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Figure 9. A novel type of figure rendering the phase-averaged expected (photon) specific intensity as a function of sky direction 
for the source-receiver configuration estimated to maximize the (background-marginalized) ST-U likelihood function given by 
Equation (5), showing the two different posterior modes and the effect of XMM likelihood function inclusion. Top-left set of four 
panels: NICER likelihood function—mode one. Bottom-left set of four panels: NICER likelihood function—mode two. Top- 
right set of four panels: product of NICER and XMM likelihood functions—mode one. Bottom-right set of four panels: product 
of NICER and XMM likelihood functions—mode two. The expected photon specific flux spectrum registered by NICER if we 
phase-average, and (when included) the XMM cameras, is implicitly formed from a fine set of these images. These representative 
images at four photon energies include all relativistic effects in the likelihood function; note that we extend slightly beyond the 
XMM waveband used for likelihood evaluation in order to render the (relativistic) rotational effects more vividly. Each panel 
is normalized to the maximum phase-average specific intensity over sky direction at that energy. The background sky has the 
same intensity—zero—as neighborhoods of the image that a hot region never traverses because the surface exterior of the hot 
regions is not explicitly radiating in the models (please see Section ). For animated (photon) specific intensity sky maps 
corresponding to these four variants (two posterior mode variants for each of two likelihood function variants), together with 
pulse-profile traces and spectral evolution, please refer to the online journal. 
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is deformed into an oblate shape. While the shape func- 
tion that we use is an approximation, Silva et al. (2021) 
have shown that it is sufficiently accurate for all of the 
rotation-powered pulsars with spin frequencies less than 
400 Hz that NICER observes. 

The radius inferred for PSR J0740--6620 is very sim- 
ilar to the radius inferred from pulse-profile modeling 
of NICER data for PSR J00304-0451, although for the 
latter the inferred mass was lower: Riley et al. (2019) 
found Req = 12.717136 km and M = 1.34491} Mo; 
the independent analysis of Miller et al. (2019) found 
Req = 13.02'124 km and M = 1.44038 Mo. Note 
that the width of the credible interval on the radius for 
PSR J0740+6620 is not smaller, despite the constrain- 
ing mass prior: this is due to the lower number of source 
counts, which limits the precision of the radius measure- 
ment. The radius that we report for PSR J0740+6620 is 
also consistent with the values inferred from the most re- 
cent phase-averaged spectral modeling of quiescent and 
bursting neutron stars (Nättilä et al. 2017; Steiner et al. 
2018; Baillot d'Etivaux et al. 2019; González-Caniulef 
et al. 2019, noting that for some of these analyses a 
neutron star mass is assumed rather than inferred or 
known in advance), and with indirect constraints from 
the inner radii of accretion disks (Ludlam et al. 2017). 

'The detection of gravitational waves from binary neu- 
tron star mergers provides an alternative method of con- 
straining the dense matter EOS. A measurement of tidal 
deformability and neutron star masses from the late in- 
spiral phase can be used to infer EOS parameters and 
hence the associated mass-radius relation. The poste- 
riors on any mass-radius relation derived in this way 
depend on the EOS model and the priors on the model 
parameters, and this should be kept in mind when com- 
paring them to the radii inferred directly (without refer- 
ence to EOS models) from pulse profile modeling (Greif 
et al. 2019). Nevertheless the radii derived from the 
tidal deformability of the binary neutron star merger 
GW170817 are (for a range of EOS models) lower then 
the value we derived for PSR J07404-6620 (see for ex- 
ample Abbott et al. 2018, 2019; De et al. 2018; Most 
et al. 2018; Landry et al. 2020; Essick et al. 2020; Li 
et al. 2020). However, the credible intervals on the 
gravitational-wave derived radii are of similar extent, 
and thus the results are certainly consistent with those 
derived by NICER. 

As is clear from Figure 13, the radius of 
PSR J07404-6620 is in the center of the range consid- 
ered plausible for neutron stars ~ 2 Mo, and appears 
compatible with a wide range of EOS models (see e.g. 
Hebeler et al. 2013; Greif et al. 2019). However full 
Bayesian inference of EOS models is required to fully 
quantify the constraints arising. For this we refer the 
reader to the companion paper by Raaijmakers et al. 
(2021), which carries out EOS inference using results 
from NICER both individually and in combination with 
constraints from gravitational wave observations and 


their electromagnetic counterparts. Using two different 
high density EOS parameterizations, and models that 
connect to microscopic calculations of neutron matter 
from chiral effective field theory interactions at nuclear 
densities, Raaijmakers et al. (2021) show that the new 
NICER results provide tight constraints, for example on 
the pressure of neutron star matter at around twice sat- 
uration density. 

'The measurement of radius for a high mass neutron 
star is also interesting for the properties of potential 
quark cores (see for example Annala et al. 2020). Han & 
Prakash (2020) consider the implications of such a mea- 
surement for our understanding of quark matter phases 
in neutron stars for different model types: self-bound 
strange quark star models; hybrid star models with dif- 
ferent types of phase transition; and third family mod- 
els where two branches with different radii are possible 
for the same mass (Schertler et al. 2000). Our upper- 
and lower-bounds on the radius posterior at high mass 
disfavor some regions of quark matter parameter space: 
both the stiffness of the strange quark phase and the 
transition properties. The inferences are moreover quite 
restrictive for self-bound strange quark stars and third 
family stars, both of which typically have low radii at 
high mass. 

We can also look at the implications of the change 
in radius as one moves from M ~ 2.0 Mọ to M ~ 
1.4 Mo, an important distinguishing characteristic of 
different EOS models (Greif et al. 2019; Han & Prakash 
2020; Xie & Li 2020; Drischler et al. 2021). Generally, 
hadronic EOSs having symmetry energy parameters in 
the ranges predicted by nuclear mass fits and neutron 
matter studies and with Mmax S 2.2 Mo would result 
in AR = Roo — Rı4 S —1 km. The above studies 
show that matter with a phase transition around 2n; 
to a relatively soft phase with sound speed squared 
c2 ~ 1/3 (such as to non-interacting quark matter) 
would also result in AR < —1 km. Such models also 
have Mmax < 2.2 Mo. In contrast, larger values of 
AR < 0.5 km suggest either stiffer high-density mat- 
ter without a phase transition having Mmax ~ 2.3 — 2.5 
Mo, or a transition to a relatively stiff phase at a tran- 
sition density > 2.6n;4. (Drischler et al. 2021). Even 
larger values of AR > 0.5 km would imply a transi- 
tion at a lower density < 2ng44. to similarly stiff mat- 
ter. The companion paper by Raaijmakers et al. (2021), 
which utilizes parameterized EOS models constrained 
by theories of neutron matter, together with observa- 
tions of pulsar masses, gravitational waves from merg- 
ers, and the X-PSI NICER results for PSR J0030--0451 
and PSR J0740+6620, infers that AR ~ —0.5*12 km 
averaged over EOS models. This is consistent with the 
direct observational value Rjo7z49 — Rjo030 = —-0.3*12 
km (this paper, Riley et al. 2019). Although values of 
AR « —1 km and AR > +1 km cannot be ruled out, 
these results suggest more moderate values of AR that 
favor relatively stiff dense matter with a large Mmax or 


A NICER VIEW OF THE MASSIVE PULSAR PSR J07404-6620 25 


o 


(y/Req)y 1 — rs/Req 


—1 [0.33 cycles] 


(XIReq) V 1 — rs/Req (X/Req) V 1 — rs/Req (X/Req) V 1 — rs/Req 


Figure 10. A summary of the animated figure available in the online journal. In the animated figure, the top three panels 
show the (photon) specific intensity as a function of sky direction at three different photon energies as the star rotates. The 
bottom-left panel displays the (photon) specific flux pulse-profiles traced out by the skymaps, each normalized to its respective 
maximum. The bottom-right panel displays the (photon) specific flux spectrum, where the energy bounds each correspond to 
a skymap energy, as does the vertical line; the trace of the vertical line intersecting the spectrum is one of the pulse profiles. 
The star rotates 16 times during the 48 second animation. In this summary figure we aim to display the gravitationally lensed 
geometric configuration of the surface hot regions from our Earthly viewing perspective, over the course of one rotational cycle. 
We display the (photon) specific intensity as a function of sky direction at the lowest energy, as the star rotates through the 
panels from left to right and from top to bottom. 
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Figure 11. Posterior-expected pulse-profiles conditional on the ST-U hot region model and either the NICER likelihood function 
(left panels) or the NICER and XMM likelihood function (right panels). We show the signal incident on the telescopes (top 
and top-center panels) and as registered by NICER (bottom-center and bottom panels). The signal in the top panels has 
been integrated over the linearly-spaced instrument energy intervals, and is effectively proportional to the photon specific flux. 
The black rate curves are the posterior-expected signals generated by the hot regions in combination. We also represent the 
conditional posterior distribution of the incident photon flux (top-center panels) and the NICER count rate (bottom panels) at 
each phase as a set of one-dimensional highest-density credible intervals, and connect these intervals over phase via the contours; 
these distributions are denoted by m(photons/cm?/s;Q) and m(counts/s;$). Note that the fractional width of the credible 
interval at each phase is usually higher for 7(photons/cm?/s; o) than for «(counts/s; 9) because of the variation permitted for 
the instrument model; in combination, the signal registered by the instrument is more tightly constrained. To generate the 
conditional posterior bands we apply the X-PSI package, which in turn wraps the fgivenx (Handley 2018) package. 


an EOS with stiffening at a density 2—3ngat, which could ulations, is also a function of the EOS. Currently feasi- 
result from a first order phase transition or a crossover ble EOS models would permit a maximum neutron star 
transition like that due to the appearance of quarkyonic mass in the range 2— 3 Mo; but stiffer EOS, with larger 
matter (McLerran & Reddy 2019). Observational upper radii, are required to achieve higher masses. Analysis 
limits to Mmax, such as the value Mmax SS 2.2 — 2.3 Mo of the electromagnetic counterpart of the binary neu- 
suggested by GW170817 (Margalit & Metzger 2017), tron star merger GW170817 has suggested a maximum 
could help distinguish these possibilities. neutron star mass somewhere in the range 2.0 — 2.3 Mo 

'The maximum mass of neutron stars, and hence the (Margalit & Metzger 2017; Rezzolla et al. 2018; Ruiz 


boundary between the neutron star and black hole pop- et al. 2018; Shibata et al. 2019), but there is a strong 
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Figure 12. Posterior-expected spectra conditional on the ST-U hot region model and either the NICER likelihood function 
(left panels) or the NICER and XMM likelihood function (right panels). We show the spectrum that would be incident on 
the telescopes if it were unattenuated by the interstellar medium (top panels) and the signal as registered by NICER (center 
and bottom panels). The black rate curves are the posterior-expected spectra generated by the hot regions in combination. We 
represent the conditional posterior distribution z(photons/keV /cm? /s; E) of the unattenuated incident photon specific flux at 
each energy as a set of one-dimensional highest-density credible intervals, and connect these intervals over phase via the contours 
(top panels); the energies displayed are those spanning the waveband of the NICER channel subset [30, 150). In the center panels 
we display the background-marginalized posterior-expectation of the source count-rate signals, plus the background count-rate 
terms that maximize the conditional likelihood functions; the center-right signal is equivalent to that displayed in the center 
panel of Figure 6. In the bottom panels we display the posterior-expected count-rate spectra generated by the hot regions in 
combination and individually, together with the conditional posterior NICER count rate distribution z(counts/s; channel) for 
each channel. Moreover, the topmost green step functions are the phase-average of the center panels—each is effectively, but 
not exactly, the observed count-number spectrum divided by the total NICER exposure time. 


dependence on how the kilonova is modeled. Neverthe- 4.2. Constraining power of XMM-Newton 


less this range is consistent with the relatively soft EOS The likelihood function given NICER and XMM data 
inferred from the tidal deformability for GW170817 (see sets is dominated by the information from the former. 
above). The recent detection of GW190814, a binary However, longer XMM exposure times naturally yield 
compact object merger involving an object with mass greater constraining power. A deep exposure exists 


~ 2.6 Mo (Abbott et al. 2020), is however intriguing. for the rotation-powered millisecond PSR. J0030--0451 
There is considerable debate over whether this object (Bogdanov & Grindlay 2009), and as suggested by Ri- 
could be a high-mass neutron star rather than a low ley et al. (2019), the associated spectroscopic (and tim- 


mass black hole (Fattoyev et al. 2020; Sedrakian et al. ing) event data can be jointly modeled with the NICER 
2020; Huang et al. 2020; Drischler et al. 2021; Tan et al. event data set to address the open question regarding 
2020; Tsokaros et al. 2020; Dexheimer et al. 2021; Zhang the contribution of the model hot regions to that set of 
& Li 2020b; Tews et al. 2021; Godzieba et al. 2021) and events, which Miller et al. (2019) and Riley et al. (2019) 
still be consistent with GW170817. The radius that we inferred to be (close to) minimal. The term minimal is 
have inferred for PSR J07404-6620 suggests a lower max- used to mean that the hot regions contributed only to 
imum mass, however (see Raaijmakers et al. 2021). 
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Figure 13. Mass versus equatorial radius for several ex- 
ample EOS models from Hebeler et al. (2013), showing the 
difference between non-rotating stellar models and stars ro- 
tating at 346 Hz. For each EOS shown the right hand 
(heavier) curve is for a spin of 346 Hz, while the left-hand 
(lighter) curve is for zero rotation. The 68% and 9596 credi- 
ble regions for mass and radius inferred from our analysis of 
PSR J0740+6620 are shown by the shaded cyan contours. 
The blue crosshair shows the inferred median values. 


the pulsed component of the pulse profile and did not 
contribute to the unpulsed component. This present 
Letter offers a demonstration of how this can be exe- 
cuted, albeit with a contribution from XMM that is less 
informative than the contribution from NICER. 

The XMM data set for PSR J07404-6620 is phase- 
averaged and sparse in terms of overall counts, which 
renders it less constraining than the NICER data set. 
However, being an imaging telescope, XMM facilitates 
better quantification of the contribution from the star 
(attributed to hot regions in our models) compared 
to the background, whereas the NICER background is 
more difficult to constrain both a priori and a posteri- 
ori. The contribution of the hot regions to the unpulsed 
component of the NICER pulse-profile is constrained by 
jointly modeling the NICER and XMM event data. 

For PSR J0740--6620, the contribution from the hot 
regions is not minimal. Even when one considers only 
the NICER data set, the hot regions are inferred to gen- 
erate not only the pulsed component but also part of the 
unpulsed component of the pulse profile. The effect of 
including the XMM data in the analysis is to increase 
the amplitude of the emission from the hot regions by 
reducing the contribution of the hot regions to the un- 
pulsed component of the pulse profile. Smaller radius 
stars have larger gravitational fields and cause stronger 
gravitational lensing. The lensing makes it possible to 


see the hot regions for a larger fraction of the spin pe- 
riod, the resulting signal has a lower pulsed fraction than 
the signal from a larger star with less lensing. The sam- 
ples where the NICER-only unpulsed signal is brighter 
are those where PSR J0740--6620 is more compact; by 
weighting away from these, the inclusion of XMM data 
pushes the posterior towards less compact stars, where 
the radius is higher (see also the discussion in Miller 
2016). 

An interesting question is whether the analysis pre- 
sented in this Letter tells us anything about the effect 
that a full joint inference analysis might have on the ra- 
dius inferred for PSR J00304-0451. The emission from 
the hot regions for PSR J00304-0451 conditional on only 
NICER event data was minimal, meaning the unpulsed 
component of the combined signal from the hot regions 
was small. It follows that the inclusion of the XMM 
constraints can only increase the contribution from the 
hot regions. However, the magnitude of the increase in 
brightness and the effect on the inferred radius is hard 
to predict because several parameters in the model are 
degenerate and changes in radius can be offset by, e.g., 
changes in hot region parameters. 

Finally, we explore sensitivity to prior information 
about the NICER and XMM energy-independent ef- 
fective area scaling factors. We importance-sample our 
joint NICER and XMM posterior to compress the joint 
prior on these scaling factors, as shown as Figure 14. 
The compressed joint prior approximates published tele- 
scope calibration uncertainties (see Section 2.4) by us- 
ing telescope-specific scaling factors, each with a Gaus- 
sian prior whose standard deviation is 3%. By com- 
pressing the joint prior, the marginal posterior distri- 
bution of the XMM scaling factor median shifts from 
~ 0.93 up to ~ 0.98. Consequently, to conserve the 
normalization of the high-likelihood count number spec- 
tra registered by each XMM camera—which a posteri- 
ori have larger typical effective areas after compressing 
the prior—the brightness of the signal incident on the 
telescope must decrease. It follows that subject to con- 
serving the pulsed component of the combined signal 
from the hot regions as required by the NICER event 
data, the brightness of the unpulsed component of the 
combined signal decreases as the high-likelihood regions 
of parameter space shift to slightly less compact stars— 
and thus to slightly higher radii given the informative 
mass prior. The overall shift in the posterior PDF of 
the radius due to the compression (~ +0.3 km in the 
median, to R = 12.717)33 km) is much smaller than 
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Figure 14. One- and two-dimensional marginal posterior 
distributions of the NICER and XMM energy-independent 
effective area scaling factors ax 1 and apn respectively, and 
the equatorial radius. The XMM scaling factor apn is shared 
by the three cameras and is denoted by axmm in Table 1. 
The blue NICER x XMM posterior is shown in Figures 7 
as the headline posterior; the properties of this posterior are 
reported in Table 1. The red NICER x XMM | EAPC is 
conditional on a compressed effective area prior that aims to 
approximate the telescope calibration uncertainties discussed 
in Section 2.4; the acronym EAPC simply means effective 
area prior compression. The prior PDFs are displayed as 
the dash-dot distributions in each on-diagonal panel. 


the measurement uncertainty®®. However it highlights 
that instrument cross-calibration is an important aspect 
of these analyses that warrants careful treatment. 
Obtaining estimates of the absolute effective area 
of an X-ray instrument is a challenging task. Cross- 
calibration efforts by the International Astrophysical 
Consortium for High Energy Calibration (IACHEC) us- 
ing observations from multiple concurrent X-ray tele- 
scopes have found offsets typically within +10% but 
with occasional discrepancies reaching up to ~ 20% (see, 


e.g., Ishida et al. 2011; Plucinsky et al. 2017; Madsen 
et al. 2017). The tails of the joint posterior PDF of 
the effective scaling parameters (see Figure 14) go be- 
yond what these calibration measurements indicate, so 
the resulting radius credible intervals should be taken as 
conservative estimates. 


4.2.1. NICER and XMM-Newton backgrounds 


The NICER background is difficult to estimate di- 
rectly, but there are two tools available. Figure 15 shows 
the NICER background estimated using the 'space 
weather’ model (Gendreau 2020) and the ‘3C50’ model 
(Remillard et al. 2021), see also Bogdanov et al. (20192). 
'The former models background due to the space weather 
environment, which varies as NICER moves through dif- 
ferent geomagnetic latitudes, and depends on solar ac- 
tivity. The :3C50' model is empirical, taking into ac- 
count two different types of particle-induced events, the 
cosmic X-ray background, and a soft X-ray noise com- 
ponent related to operation in sunlight. We also render 
a set of NICER background estimates for comparison: 
using joint NICER and XMM posterior samples, we dis- 
play the NICER background spectrum that maximizes 
the conditional likelihood function, yielding a band that 
is a proxy for background variable posterior mass. 

'The background spectra displayed in Figure 15 ex- 
ceed the space weather model at low energies. This 
excess is not unreasonable given the presence of mul- 
tiple other point sources in the NICER field of view.?? 
In higher channels, the background spectra appear to 
agree well with the space weather model. There is a 
possible small systematic under-prediction compared to 
the space weather model in channels 80—100, the level 
of agreement is satisfactory given the current uncertain- 
ties on the background modeling. The level of agree- 
ment with the ‘3C50’ model, which exceeds the space 
weather model at low energies and is consistent with 
it at higher energies, also appears good. Neither back- 
ground model includes off-axis X-ray sources that might 
contaminate the target spectrum, and therefore inferred 
backgrounds that exceed the two estimates are not in 
principle a problem. 

Once the uncertainties on the two NICER background 
models are understood more fully, we anticipate being 
able to use them to reduce systematic error in the pulse- 
profile modeling. The current radius measurement con- 
ditional on the XMM data set—which yields an indirect 


38 Such a small shift is not expected to have any remarkable effect 
on EOS inference, given typical EOS priors (Greif et al. 2019). 
This is demonstrated in Raaijmakers et al. (2021), where EOS 
inference is carried out using both our NICER-only inferred ra- 
dius and our NICER x XMM inferred radius. Despite an overall 
change in the median radius posterior inferred from the pulse 
profile modelling ~ +1.1 km once the XMM data set is included, 
the mass-radius band shifts by a much smaller amount than this, 
due to the strong influence of the priors on the EOS model. 


39 The NICER background variables in principle also capture phase- 
invariant emission from the environment of PSR J0740+6620 
that does not originate from the surface hot regions in the X- 
PSI model. The XMM background prior information is con- 
servative (see Section 2.5.2) and can also in principle capture 
such phase-invariant emission. The XMM likelihood function is 
not purely marginalized with respect to an informative blank- 
sky background prior, which could attribute too much emission 
to the hot regions in the X-PSI model. 
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NICER background constraint—will therefore be super- 
seded. Efforts are also ongoing to quantify the level of 
background due to any off-axis sources in the field of 
view. 

Fig. Set 15. NICER background. 

In Figure 16 we display the XMM pn background 
spectra that maximize the conditional likelihood func- 
tion given nested samples from the joint NICER and 
XMM posterior, together with supplementary informa- 
tion. Graphical checking of the spectra against the 
blank-sky estimate and the event data does not reveal 
any problems a posteriori. 

Fig. Set 16. XMM-Newton background. 


4.3. Hot region configuration 


'The hot region configuration is assumed to be related 
to the star's magnetic field structure. In our previous 
analysis of PSR J0030--0451, the superior model was 
ST+PST: one of the hot regions was a small spherical 
cap and the other a long extended arc. A configuration 
in which the hot regions were antipodal was strongly 
disfavored. Although the hot regions were separated by 
approximately 180? in longitude, both were in the same 
hemisphere of the star. 

An antipodal configuration is also disfavored for 
PSR J0740--6620, once again arguing against a simple 
dipolar model (although the configuration is closer to 
antipodal than it is for PSR J00304-0451). The location 
of the emitting regions is however rather different. The 
expected number of counts contributing to the total ex- 
pected NICER signal for PSR J07404-6620 is not (close 
to) minimal a posteriori, despite the diffuse XMM likeli- 
hood function (Section 4.2). Only one of the hot regions 
vanishes from sight during the rotational cycle; the other 
remains visible at all times. For PSR J0030+0451, the 
expected number of counts contributing to the total ex- 
pected NICER signal is (close to) minimal a posteriori 
for all models that passed graphical posterior predictive 
checking (Riley et al. 2019); in all cases the hot regions 
were inferred to dance around the stellar limb, each be- 
ing entirely non-visible for a substantial fraction of a 
rotational cycle. 

We considered a range of shapes for the hot regions, 
from circles to rings and arcs. For PSR J0740+6620 
the ST-U model (in which both hot regions are circles) 
provides a reasonable description of the data in terms 
of e.g. residuals. A more complex model, ST+PST (the 
superior model for PSR J0030--0451) did not offer any 
improvement in model quality measures nor did it lead 
to changes in the inferred radius or hot region geome- 
try. For PSR J0030--0451 ST-U provided a reasonable 
description of the data, but we were sensitive a posteriori 
to additional complexity in the structure of one of the 
hot regions; consequently, a large shift in the inferred ra- 
dius was reported. No extended arc structure is inferred 
for the secondary hot region for PSR J07404-6620, al- 
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Figure 15. A comparison of the NICER background con- 
ditional on the NICER likelihood function to the NICER 
background conditional on the NICER and XMM likelihood 
function. The blue step function is the total NICER count 
spectrum, assumed to be generated by the surface hot re- 
gions and the phase-invariant background in our modeling. 
The solid black step function is the background that maxi- 
mizes the conditional likelihood function given the parameter 
vector associated with the nested sample reporting the high- 
est value of the background-marginalized likelihood function. 
The orange step functions (of which there are 10?) that form 
a band are defined similarly, but each is conditional on a sam- 
ple from the joint NICER and XMM posterior. To ensure 
tractability, we marginalize over background parameters in 
order to define a sampling space with O(10) dimensions; it 
follows that we cannot estimate marginal posterior PDFs 
from our posterior samples for each background variable due 
to information loss. Strictly, the orange band should there- 
fore not be interpreted as a collection of posterior PDFs—one 
per background variable—but are indicative of where poste- 
rior mass will be concentrated. We compare the backgrounds 
to estimates of the NICER background generated using the 
NICER Space Weather background estimation tool (Gen- 
dreau 2020) and the :3C50' model (Remillard et al. 2021). 
We provide a supplementary figure that shows the NICER 
background for the 10? highest-likelihood nested samples, 
given the NICER and XMM likelihood function; we also 
provide a figure that shows the NICER background for a set 
of 10? posterior samples after compression of the joint prior 
PDF of the telescope effective areas (see Figure 14 and asso- 
ciated text). The complete figure set (3 images) is available 
in the online journal. 


though the hot region could well be a ring instead of a 
spherical cap. 
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Figure 16. We add to Figure 4 the XMM pn background 
spectra that maximize the conditional likelihood function 
(denoted by MCL in the legend) given nested samples from 
the joint NICER and XMM posterior, but subject to the 
prior support of the background variables. That is, if the 
maximum of the conditional likelihood function is not within 
the prior support, the nearest value of the background to the 
maximum that is within the prior support is used in the spec- 
trum. We also show the total spectra as the sum of the XMM 
pn source spectra (given the nested samples from the joint 
NICER and XMM posterior) and the XMM pn background 
spectra that maximize the conditional likelihood function. 
The complete figure set (3 images) for the three XMM cam- 
eras is available in the online journal. 


The pulse profile modeling presented in this work con- 
strains the location and shape of the hot regions on the 
neutron star surface. These regions arise either via heat- 
ing by magnetospheric currents (Kalapotharakos et al. 
2021), or through complex magneto-thermal evolution 
in the stellar crust (De Grandis et al. 2020). Thus, the 
information obtained can be used as input for model- 
ing magnetic field structure both in the magnetosphere 
and inside the star, however, there are currently many 
unknowns in the picture. 

Qualitatively, the pulsars which spin faster have more 
compact magnetospheres and larger (and more complex 
if the field has a substantial non-dipolar component) 
open field line regions. If heating happens at the open 
field line footprints, then one would expect heated re- 
gions of PSR J0740+6620 (with the ratio of light cylin- 
der to neutron star radii, Rgc/Rys = 11.1) to be larger 
than those of PSR J0030+0451 (Rr,c/ Rxs = 18.3), pro- 
vided that both pulsars have field configurations of simi- 
lar complexity (i.e. similar relative magnitude of higher- 


order components), contrary to what is being inferred 
from the data. 

Detailed modeling of pulsar magnetic fields similar 
that performed by Kalapotharakos et al. (2021), to- 
gether with an analysis of crustal thermal evolution 
would be interesting from an evolutionary point of view. 
PSR J0740--6620 has a white dwarf companion while 
PSR J0030+0451 is solitary and the difference in re- 
cent accretion history may play a role in field configura- 
tion and residual heating pattern. Their masses are also 
substantially different, and according to the population 
study by Antoniadis et al. (2016), such a large difference 
cannot be attributed to accretion alone and must stem 
partly from the difference in progenitor properties. 


4.4. Analysis cross-check 


An independent analysis carried out within the 
NICER collaboration by Miller et al. (2021) reports a 
PSR J07404-6620 radius of 13.71*7 92 km, derived from 
their combined NICER and XMM analysis. The 68% 
credible intervals overlap with those that we report in 
this Letter, but the differences deserve some explana- 
tion. Recall that our pulse-profile modeling involves 
several elements: the NICER phase-resolved data set; 
the XMM phase-averaged data set; a model for the gen- 
eration of the count data (including priors on the model 
parameters); and statistical samplers. 

Let us first focus on the analysis of the NICER data. 
The two teams make different choices on what energy 
channels to include in the NICER data set: we use chan- 
nels [30,150) whereas Miller et al. (2021) use channels 
[30,123] (although Miller et al. (2021) report that includ- 
ing higher channels does not lead to notable changes in 
their results). 

'The two teams also make a number of different prior 
choices. Miller et al. (2021) assume priors on the mass, 
distance, and inclination with larger spread to account 
for potential systematic error on top of the values re- 
ported by Fonseca et al. (2021a). Miller et al. (2021), 
unlike us, do not impose a hard upper-limit on the 
prior support of the radius (see Section 2.2.1): they as- 
sume a flat prior on the reciprocal of the compactness 
Reg /rg(M) ~ U(3.2,8.0).4° We define the prior support 
so as to exclude hot-region exchange degeneracy—thus 
halving the number of posterior modes—whereas Miller 
et al. (2021) do not exclude exchange degeneracy. We 
also condition on different prior PDFs of the hot region 
center colatitudes and effective temperatures: the prior 
PDFs of our hot region center colatitudes are isotropic;*! 
and our prior PDFs of the logarithms of the effective 
temperatures are uniform. Finally, we use a marginal 


40 The absence of prior support for high radii is effectively incorpo- 
rated at a later stage, in the EOS analysis carried out by Miller 


et al. (2021). 


41 Meaning uniform in the cosine of the hot region center colatitude. 
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prior distribution of the energy-independent NICER ef- 
fective area scaling factor that has a larger spread and 
broader prior support than Miller et al. (2021); as we 
discuss below, however, this is not thought to be impor- 
tant. Our prior PDFs are defined in Table 1. 

The two teams implement different statistical sam- 
pling protocols. We use nested sampling (MultiNest) 
with high-resolution settings, whilst Miller et al. (2021) 
use a hybrid nested sampling (MultiNest) and parallel- 
tempering ensemble MCMC scheme; Miller et al. (2021) 
use far more core hours during the ensemble sampling 
phase of their computations than during the nested sam- 
pling phase. Where both teams use MultiNest, our res- 
olution settings are higher:?? we use up to 8 x 104 live- 
points with a bounding hypervolume expansion factor 
of 10, and we eliminate hot-region exchange degeneracy. 
For the same target distribution, using a higher num- 
ber of live-points and eliminating hot-region exchange 
degeneracy (and thus the halving the number of modes) 
both yield lower likelihood-isosurface bounding approx- 
imation error; a higher number of live points also yields 
finer sampling of the distribution. Potential variation 
arising from sampler choice is nicely illustrated in the 
exploratory study by Ashton et al. (2019), which inves- 
tigates the effect on estimation of parameters of gravi- 
tational wave signals from compact binary coalescences. 

These different modeling choices lead to some minor 
differences in the reported spreads of the radius posterior 
PDFs conditional on NICER data. However, the degree 
of overlap is high: we find R = 11.20* 12? km (see Table 
2); Miller et al. (2021) find R = 11.51*1 57 km. 

The radius posterior PDF differences become more 
pronounced once the XMM data set is included. Our 
posterior is shifted down in radius relative to the Miller 
et al. (2021) posteriors which extend to higher radii. 
Once again, the two groups make a number of differ- 
ent choices that have more of an impact for a smaller 
data set. We use different formulations of the prior on 
the XMM background: Miller et al. (2021) use a dis- 
tribution based on the assumedly Poissonian observed 
numbers of blank-sky counts whereas we use a flat prior 
as described in Section 2.5.2. Our posterior is also con- 
ditional on a broader prior for the cross-calibration un- 
certainty of the two instruments than Miller et al. (2021) 
(who restrict the maximum relative calibration offset to 
+10 %); this permits lower inferred radii in our analysis 
(see Section 4.2, where we study the effect of narrowing 
the cross-calibration uncertainty). And as already men- 
tioned, Miller et al. (2021) are more agnostic in terms 


42 A caveat is that the performance of nested sampling with given 
resolution settings, on the same target distribution, is also depen- 
dent on the structure of the likelihood function in the native sam- 
pling space—the native sampling space is not however unique be- 
cause different transformations can be defined to inverse-sample 


a particular joint prior PDF. 


of priors on the radius (allowing R > 16 km): the XMM 
likelihood function permits larger radii (see Figure 7) 
and hence their posterior PDF extends accordingly to 
higher radii. 

Despite these differences, the inclusion of the XMM 
data is still extremely valuable, because in both anal- 
yses there is a consistent increase in radius, with the 
lowest radii being ruled out. Moreover, there are good 
prospects for improving this: once the uncertainties on 
the NICER background estimates mentioned in Section 
4.2.1 are clear, we anticipate being able to use those to 
supplement the indirect constraint on the NICER back- 
ground provided by the XMM data set. 


5. CONCLUSIONS 


We have derived a posterior distribution of the ra- 
dius of the massive rotation-powered millisecond pulsar 
PSR J07404-6620, conditional on NICER XTI pulse- 
profile modeling, joint NANOGrav and CHIME/Pulsar 
wideband radio timing, and XMM EPIC spectroscopy. 
The radius that we infer for PSR J0740+6620 is 


12.391730 km with an inferred mass (dominated by the 


radio-derived prior) of 2.072*0067 Mo. A measurement 


of radius for such a high-mass pulsar should provide 
a strong constraint on dense matter EOS models (see 
Raaijmakers et al. 2021), with the derived radius favor- 
ing models of intermediate stiffness. We anticipate be- 
ing able to improve this measurement in the near future 
thanks to the ongoing development of detailed models 
of the NICER background. This will be incorporated 
into future pulse-profile modeling, improving upon the 
current indirect constraint provided by the XMM data 
set. 

Pulse-profile modeling also enables us to infer the 
properties of the X-ray emitting hot regions, which are 
assumed to be linked to the magnetic field structure. 
'The two hot regions are not antipodal, arguing against 
a simple dipole magnetic field. There is however no evi- 
dence for extended crescents, as indicated by pulse pro- 
file modeling for PSR J0030--0451 (Riley et al. 2019); 
simple circular hot regions (spherical caps) suffice to de- 
scribe the PSR J07404-6620 data adequately. How this 
relates to the evolutionary history of the two sources 
remains to be determined. 

The analysis presented here also includes improve- 
ments to our pulse-profile modeling methodology and 
software, most notably the ability to include (in this 
case) phase-averaged X-ray data from XMM EPIC. For 
PSR J0740--6620, the inclusion of this data set led to 
a remarkable shift in the inferred radius. We have also 
investigated the sensitivity to uncertainties in instru- 
mental cross-calibration, an area where we may be able 
improve our modeling in the future. XMM EPIC data 
sets exist for other NICER pulse-profile modeling tar- 
gets, including the source analyzed in Riley et al. (2019), 
PSR J00304-0451. In that analysis the XMM EPIC data 
was used retrospectively, as a check on the consistency 
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of the inferred model a posteriori; more formally this 
information should be used to form a likelihood func- 
tion that is a product of likelihood function factors over 
telescopes, as in this present work. In future work, we 
will use the improved pipeline presented in this Letter 
to perform joint analysis of the NICER and XMM data 
sets for PSR J00304-0451 and other sources. 
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Facility: NICER XTI (Gendreau et al. 2016), 
NANOGrav, Green Bank Telescope, CHIME/Pulsar, 
XMM-Newton EPIC. 


Software:  Python/C language (Oliphant 2007), 
GNU Scientific Library (GSL; Gough 2009), 
NumPy (van der Walt et al. 2011), Cython (Behnel et al. 
2011), SciPy (Jones et al. 2001-), OpenMP (Dagum 
& Menon 1998), MPI (Forum 1994), MPI for 
Python (Dalcín et al. 2008), Matplotlib (Hunter 2007; 
Droettboom et al. 2018), IPython (Perez & Granger 
2007), Jupyter (Kluyver et al. 2016), TEMPO2 (photons; 
Hobbs et al. 2006), PINT (photonphase; https:// 
github.com/nanograv/PINT), MULTINEST (Feroz et al. 
2009), PvMurrINEST (Buchner et al. 2014), Get- 
Dist (Lewis 2019, https://github.com/cmbant/getdist), 
nestcheck (Higson 2018; Higson et al. 2018, 2019), 
fgivenx (Handley 2018), NICERsoft (https://github. 
com/paulray/NICERsoft), X-PSI v0.7 (https://github. 
com/ThomasEdwardRiley/xpsi; Riley 2021). 
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APPENDIX 


Table 1. Summary table for ST-U NSX fully-ionized hydrogen hot regions plugged into the NICER x XMM likelihood function, 
conditional on the NANOGrav x CHIME/Pulsar prior PDF. The description in this caption is largely adopted from Riley 
et al. (2019) for consistency. We provide: (i) the parameters that constitute the sampling space, with symbols, units, and short 
descriptions; (ii) any notable derived or fixed parameters; (iii) the joint prior distribution, including hard truncation bounds 
and constraint equations that define the hyperboundary of the support; (iv) one-dimensional (marginal) 68.396 credible interval 
estimates symmetric in posterior mass about the median (Cleax); (v) KL-divergence estimates in bits (Dx) representing prior- 
to-posterior information gain (see the appendix of Riley et al. (2019) for high-level description of the divergence); and (vi) 
the parameter vector (ML) estimated to maximize the background-marginalized likelihood function, corresponding to a nested 
sample. Note that strictly, the target of nested sampling is not to generate a maximum likelihood estimator—it is a by-product 
of the sampling process for evidence estimation. Moreover, there is degeneracy in the likelihood function and high-likelihood 
solutions with remarkably different parameter values—such as a radius near or above the posterior median—may be retrieved 
from the public sample information. Constraint equations in terms of two or more parameters result in marginal distributions 
that are not equivalent to those inverse-sampled. 


Parameter Description Prior PDF (density and support) Ula Dki ML 
P [ms] coordinate spin period P = 2.8857,” fixed — — — 
M [Ms] gravitational mass M,cos(i) ~ N(p*, &*) 20721006. 0.01 2.070 
joint prior PDF N(p*, 5*) u* = [2.082,0.0427]" 
. _ [0.07037 0.0131? | 
— 10.0131? 0.00304? 
Reg [km] coordinate equatorial radius Ra ~ U(3rg(1), 16)*4 12.394430 0.58 11.02 
compactness condition“’ Rpolar/Tg(M) > 3 
effective gravity condition*® 13.7 € logyg geg(0) < 15.0, VO 
O, [radians] p region center colatitude cos(05) ~ U(—1,1) 1.38758 0.25 1.622 
O, [radians] s region center colatitude cos(0,) ~ U(—1, 1) 1.809740 0.24 2.303 
dp [cycles] p region initial phase" $p ~ U(—0.5,0.5), wrapped*® bimodal 3.52 0.185 
$s [cycles] s region initial phase? gos ~ U(—0.5,0.5), wrapped bimodal 3.51 0.243 
C» [radians] p region angular radius Cp ~ U(0, 7/2) 0.147000 2.18 0.093 
Gs [radians] s region angular radius Çs ~ U(0, 7/2) 0.14676 015 2.15 0.127 
no region-exchange degeneracy 0,20, 
non-overlapping hot regions function of (Op, Os, bp, bs, Cp, Gs) 


Continued on next page 


43 Cromartie et al. (2020); Wolff et al. (2021). 


44 The function rg(M) := GM/c? denotes the gravitational radius 
with dimensions of length. 


45 The coordinate polar radius of the source 2-surface, 
Rpolar(M, Req, 9), is a quasi-universal function adopted 
from AlGendy & Morsink (2014), where Q :— 2m/P is the 
coordinate angular rotation frequency. 

46 The range of effective gravity from the equator (minimum grav- 

ity) to the pole (maximum gravity) must lie within NSX limits. 

A quasi-universal function is adopted from AlGendy & Morsink 

(2014) for effective gravity geg(0; M, Req, Q) in units of cm s7? 

in the table, where Q := 27/P is the coordinate angular rotation 

frequency. 


47 With respect to the meridian on which Earth lies. 


48 The periodic boundary is admitted and handled by MuLTINEST. 
However, this is an unnecessary measure because we straightfor- 
wardly define the mapping from the native sampling space to the 
space of a phase parameter ¢ such that the likelihood function 
maxima are not in the vicinity of this boundary. 


49 With respect to the meridian on which the Earth antipode lies. 
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Table 1- Continued from previous page 
logio (Tp [K]) p region NSX effective temperature log, (Tp) ~ U(5.1,6.8), NSX limits | 5.088* 0055 2.95 6.080 
logo (Ts [K]) s region NSX effective temperature logio (Ts) ~ U(5.1, 6.8), NSX limits | 5.992* 0057 2.08 6.058 
cos(i) cosine Earth inclination to spin axis | M,cos(i) ~ N(p*, 3*) 0.0424*00029 0.01 0.044 
D [kpc] Earth distance D ~ skewnorm(1.7, 1.0, 0.23)°° | 1.21*0 15 0.10 0.995 
Nu [10??cm ?] | interstellar neutral H column density | Na ~ U(0, 10) LBBT aes 0.91 0.216 
QNICER NICER effective-area scaling QNICER, &xMM ^v N(p, X) Led 0.03 1.111 
QXMM XMM effective-area scaling QNICER, &xMM ^v N(p, ©) 0.93*013 0.17 0.638 


joint prior PDF N(p, X) 


u = [1.0, 1.0] 


0.106? 0.150? 


2 2 
e bs 0.106 


Sampling process information 


number of free parameters;?! 15 
number of processes:?? 1 

number of live points: 4 x 10* 
hypervolume expansion factor: 0.171 
termination condition: 10^! 
evidence? In Z = —20714.61 + 0.02 
number of coret hours: 28320 
likelihood evaluations: 23710136 


nested replacements: 1250386 


effective sample size:?? 420281 


50 


51 


52 


QU c 


Specifically, the PDF defined as scipy.stats.skewnorm.pdf(D, 
1.7, loc=1.002, scale=0.227), truncated to the interval D € 
(0, 1.7] kpc. 

In the sampling space; the number of background count rate vari- 
ables is equal to the number of channels defined by the NICER 
and XMM data sets. 

'The mode-separation MULTINEST variant was deactivated, mean- 
ing that isolated modes are not evolved independently and nested 
sampling threads contact multiple modes. In principle this also 
allows us to combine the processes in a post-processing phase 
using nestcheck (Higson 2018), if more than one process is avail- 
able for a given posterior; the posteriors derived in the produc- 
tion analysis are high-resolution, so we neglect combining repeat 
processes. 

Defined as the prior predictive probability 
p(dn, dx, {Ax} | ST-U). Note, however, that in order to 
complete the reported evidence for comparison to models other 
than those defined in this work, upper-bounds for the NICER 
background parameters need to be specified. 

Intel® Xeon® E5-2697A v4. 

The effective sample size estimator invoked, follow- 
ing DNest4 (Brewer & Foreman-Mackey 2018, https: 
//github.com/eggplantbren/DNest4), is the perplexity mea- 


sure 
T 

ESS := exp (- So wi log w) j 
i 


where the {wi }i=1,...,7 are the sample weights (e.g., Martino et al. 
2017). 


Table 2. Summary table for ST-U NSX fully-ionized hydrogen hot regions plugged into the NICER likelihood function, conditional 


on the NANOGrav x CHIME/Pulsar prior PDF. See the caption of Table 1 and the associated footnotes for details. 


Parameter Description Prior PDF (density and support) Cles% Dx, ML 
P [ms] coordinate spin period P = 2.8857, fixed = = — 
M [Mo] gravitational mass M,cos(i) ~ N(u, £) 2.078'0005 0.01 2.125 

joint prior PDF N(p*, 5*) u* = [2.082,0.0427]" 

s» [0.07037 0.01317 | 
0.0131? 0.00304? 

Req [km] coordinate equatorial radius Req ~ U(3rg(1), 16) 11.297522 0.72 10.90 

compactness condition 13.7 € log;o geg(0) < 15.0, VO 
©, [radians] p region center colatitude cos(05) ~ U(—1,1) 138751. 0.13 0.425 
©, [radians] s region center colatitude cos(0,) ~ U(—1,1) 1.087505 0.13 1.610 
dp [cycles] p region initial phase? p ~ U(—0.5,0.5), wrapped bimodal 3.46  —0.267 
ds [cycles] s region initial phase?" Qs ~ U(—0.5,0.5), wrapped bimodal 3.47  —0.309 
C» [radians] p region angular radius Cp ~ U(0, 7/2) 0.191+9:492 1.69 0.233 
Gs [radians] s region angular radius Cs ~ U(0, 7/2) 0.189+9:495 1.68 0.132 

no region-exchange degeneracy 0,70, 

non-overlapping hot regions function of (Op, Os, dp, bs, Sp, Cs) 
logio (Tp [K]) p region NSX effective temperature logio (Tp) ~ U (5.1, 6.8), NSX limits | 6.014+9:943 2.90 6.068 
logis (Ts [K]) s region NSX effective temperature log, (Ts) ~ U(5.1, 6.8), NSX limits | 6.017*0:065 2.89 6.086 
cos(i) cosine Earth inclination to spin axis | M,cos(i) ~ N(p*, *) 0.0426*00022 0.01 0.045 
D [kpc] Earth distance D ^ skewnorm(1.7, 1.0, 0.23)? | 1.19*0 14 0.08 1.145 
Nu [10??cm ?] | interstellar neutral H column density | Na ~ U(0, 10) qup 1.07 0.029 
QNICER NICER effective-area scaling anicer ~ N(1, 0.152) 0.97*013 0.04 1.083 


Sampling process information 


number of free parameters:?? 14 
number of processes:?? 1 

number of live points: 4 x 10* 
hypervolume expansion factor: 0.17! 
termination condition: 10^! 

number of core?! hours: 24000 
likelihood evaluations: 26858453 
nested replacements: 1208371 


effective sample size: 303905 


56 With respect to the meridian on which Earth lies. 
57 With respect to the meridian on which the Earth antipode lies. 


58 The 


PDF defined as 
loc=1.002, scale=0.227), 


truncated to the 


D € [0,1.7] kpc. 


59 Tn the sampling space; the number of background count rate vari- 
ables is equal to the number of channels defined by the NICER 
data set. 


scipy.stats.skewnorm.pdf(D, 1.7, 
interval 


60 The mode-separation MULTINEST variant was deactivated, mean- 
ing that isolated modes are not evolved independently and nested 
sampling threads contact multiple modes. 


61 Intel® Xeon® E5-2697A v4. 


